Some simple understandings of the large language model

ChatGPT's impact is stunning enough, and GPT-4 delivers even more accurate and impressive capabilities. The entire IT industry is being shaken up by large models, and every company is trying to develop its own LLM to keep up with the trend.

It makes me curious: what is the difference between an LLM and the deep learning we knew before? Here are some of my rough understandings.

Read more
How to understand 1x1 convolution? Why can 1x1 convolution reduce dimensionality?

A 1x1 convolution is simply a convolution whose kernel size is 1x1. Its purpose is clear: changing dimensionality, where "dimensionality" here means the number of channels, not adding or removing feature dimensions (the terminology in this field really needs more polish, otherwise it is quite misleading). A 1x1 convolution can therefore also be seen as a kind of fully connected layer, since it performs a similar operation and achieves a similar effect. This post analyzes the 1x1 operation in detail to aid understanding.
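
As a quick illustration, here is a minimal PyTorch sketch (the channel counts 256 → 64 and the spatial size are made-up numbers, not from the post) of a 1x1 convolution reducing the channel dimension while leaving height and width untouched:

import torch
import torch.nn as nn

x = torch.randn(1, 256, 32, 32)  # (batch, channels, height, width)
conv1x1 = nn.Conv2d(in_channels=256, out_channels=64, kernel_size=1)
y = conv1x1(x)
print(y.shape)  # torch.Size([1, 64, 32, 32]): 256 channels reduced to 64

# The same mapping expressed as a fully connected layer applied per pixel,
# which is why a 1x1 convolution resembles a linear layer.
fc = nn.Linear(256, 64)
y_fc = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
print(y_fc.shape)  # torch.Size([1, 64, 32, 32])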

Read more
Vision Transformer Paper Explained

Vision Transformer, published at ICLR 2021, has become the cornerstone of the subsequent development of Transformer models in computer vision. This post is a walkthrough of the original Vision Transformer paper, An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.

My PyTorch implementation of Vision Transformer is available on GitHub: Vision Transformer pytorch - Github. Stars & forks are welcome ✨✨✨
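
As a taste of the paper's core idea, here is a minimal sketch (the sizes follow the common ViT-Base setup, a 224x224 input with 16x16 patches and 768-dim embeddings, chosen purely for illustration) of how an image becomes a sequence of patch tokens:

import torch
import torch.nn as nn

img = torch.randn(1, 3, 224, 224)  # one RGB image
# A strided convolution slices the image into 16x16 patches and embeds each one,
# turning the image into 14 x 14 = 196 "words" of dimension 768.
patch_embed = nn.Conv2d(3, 768, kernel_size=16, stride=16)
tokens = patch_embed(img).flatten(2).transpose(1, 2)
print(tokens.shape)  # torch.Size([1, 196, 768])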

Read more

ModuleNotFoundError: No module named 'flaxformer'

✨Check the date of this article before reading!✨

Recently, Google has been building more and more models on JAX and Flax. When I tried to run the Vision Transformer pretrained models released by Google, I got the following error.

ModuleNotFoundError: No module named 'flaxformer'

But I had already installed everything in requirements.txt, so the error confused me. Moreover, flaxformer cannot be installed with pip install [package name], since it is not released on PyPI.
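
A likely workaround (assuming the package still lives at google/flaxformer on GitHub, where its source is hosted) is to install it directly from the repository:

pip install git+https://github.com/google/flaxformer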

Read more