Blog - Hongwei Zhao

Untitled

2026-02-04 数学笔记

<h2 id="数学知识">Mathematical Background<a class="anchor-link" href="#数学知识" title="Permanent link">¶</a></h2> <p>Differentiable: that is, let <span class="mat...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="求逆矩阵">Finding the Inverse Matrix<a class="anchor-link" href="#求逆矩阵" title="Permanent link">¶</a></h2> <p>We begin with the inverse matrix; for a 2×2 matrix we have <span...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="图像压缩">Image Compression<a class="anchor-link" href="#图像压缩" title="Permanent link">¶</a></h2> <p>This lecture introduces a method for lossy image compression: JPEG...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="图和网络">Graphs and Networks<a class="anchor-link" href="#图和网络" title="Permanent link">¶</a></h2> <pre><code class="langua...

Untitled

2026-02-04 数学笔记/02.线性代数

<p>For an <span class="math-inline">m \times n</span> matrix <span class="math-inline">A</span>, <span class="math-inline">ran...

Untitled

2026-02-04 数学笔记/02.线性代数

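<p>In the previous lecture, we derived several useful corollaries from three simple properties. This lecture continues to use these three basic properties:</p> <ol> <li><span class="math-inline">\det I=1</span>;</li> &...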

Untitled

2026-02-04 数学笔记/02.线性代数

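<p>This lecture introduces writing a matrix as <span class="math-inline">A=U\varSigma V^T</span>, whose factors are an orthogonal matrix, a diagonal matrix, and an orthogonal matrix. Unlike the factorizations in earlier lectures, these two orthogonal matrices...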

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="微分方程fracmathrmdumathrmdtau">The Differential Equation <span class="math-inline">\frac{\mathrm{d}u}{\mathrm{d}t}=Au</span><a class="anchor-link"...

Lecture 2: Matrix Elimination

2026-02-04 数学笔记/02.线性代数

<h1 id="第二讲矩阵消元">Lecture 2: Matrix Elimination<a class="anchor-link" href="#第二讲矩阵消元" title="Permanent link">¶</a></h1> <p>This method was first proposed by Gauss; we...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="对角化矩阵">Diagonalizing a Matrix<a class="anchor-link" href="#对角化矩阵" title="Permanent link">¶</a></h2> <p>In the previous lecture we mentioned the key equation <span c...

Lecture 4: $LU$ Factorization of $A$

2026-02-04 数学笔记/02.线性代数

<h1 id="第四讲a-的-lu-分解">Lecture 4: <span class="math-inline">LU</span> Factorization of <span class="math-inline">A</span><a class="anchor...

Untitled

2026-02-04 数学笔记/02.线性代数

<p>In the previous lecture, we learned about the projection matrix <span class="math-inline">P=A(A^TA)^{-1}A^T</span>; <span class="math-inline">Pb</span> projects the vector onto <span ...
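
As a minimal sketch of the projection formula quoted in the excerpt above, the following pure-Python snippet builds P = A(A^TA)^{-1}A^T for a matrix with two independent columns and applies it to a vector b. The matrix `A`, the vector `b`, and the helper names (`matmul`, `transpose`, `inverse_2x2`, `projection_matrix`) are illustrative assumptions and do not come from the original notes.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def inverse_2x2(M):
    """Invert a 2x2 matrix; fails if the determinant is zero."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("the columns of A must be independent")
    return [[d / det, -b / det], [-c / det, a / det]]


def projection_matrix(A):
    """Compute P = A (A^T A)^{-1} A^T for a matrix A with two columns."""
    At = transpose(A)
    return matmul(matmul(A, inverse_2x2(matmul(At, A))), At)


# Hypothetical example data: a 3x2 matrix A and a right-hand side b.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
b = [[Fraction(6)], [Fraction(0)], [Fraction(0)]]

P = projection_matrix(A)
p = matmul(P, b)  # projection of b onto the column space of A
print([row[0] for row in p])
```

Exact fractions are used so that the identities P^2 = P and P^T = P can be checked without floating-point tolerance; the 2×2 inverse limits this sketch to matrices with exactly two columns.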

Linear Algebra

2026-02-04 数学笔记/02.线性代数

<h1 id="线性代数">Linear Algebra<a class="anchor-link" href="#线性代数" title="Permanent link">¶</a></h1> <h2 id="行列式">Determinants<a class...

plt.plot([48/17], [12/17], 'o')

2026-02-04 数学笔记/02.线性代数

<p>Starting in <span class="math-inline">\mathbb{R}^2</span>, we have vectors <span class="math-inline">a, b</span>, and form <span class="math-inl...

Untitled

2026-02-04 数学笔记/02.线性代数

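<p>In this lecture we learn how to fully test whether a matrix is positive definite and whether <span class="math-inline">x^TAx</span> has a minimum, and finally the geometric meaning of positive definiteness: ellipses are related to positive definiteness, while hyperbolas are not. In addition, this lecture covers...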

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="对称矩阵">Symmetric Matrices<a class="anchor-link" href="#对称矩阵" title="Permanent link">¶</a></h2> <p>Earlier we studied the eigenvalues and eigenvectors of a matrix, and also learned...

Lecture 6: Column Space and Nullspace

2026-02-04 数学笔记/02.线性代数

<h1 id="第六讲列空间和零空间">Lecture 6: Column Space and Nullspace<a class="anchor-link" href="#第六讲列空间和零空间" title="Permanent link">¶</a></h1> <p>For vector subspaces...

Untitled

2026-02-04 数学笔记/02.线性代数

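<p>How do we determine whether an operation is a linear transformation? A linear transformation must satisfy the following two requirements:</p> <p><div class="math-display"><br /> T(v+w)=T(v)+T(w)\\<br /> T(cv)=cT(v)<b...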
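
The two requirements in the excerpt above can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the function names (`passes_linearity_checks`, `rotate90`, `shift`) and the sample vectors and scalars are hypothetical, and passing the check on a few samples does not prove linearity, whereas a single failure does disprove it.

```python
def add(v, w):
    """Component-wise sum of two vectors."""
    return [a + b for a, b in zip(v, w)]


def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * a for a in v]


def passes_linearity_checks(T, vectors, scalars):
    """Test T(v+w) == T(v)+T(w) and T(cv) == cT(v) on the given samples."""
    for v in vectors:
        for w in vectors:
            if T(add(v, w)) != add(T(v), T(w)):
                return False
        for c in scalars:
            if T(scale(c, v)) != scale(c, T(v)):
                return False
    return True


def rotate90(v):
    """Rotate a 2D vector by 90 degrees counterclockwise (a linear map)."""
    x, y = v
    return [-y, x]


def shift(v):
    """Translate a 2D vector by (1, 0); this is not a linear map."""
    x, y = v
    return [x + 1, y]


# Integer samples keep the equality tests exact.
vectors = [[1, 0], [0, 1], [2, -3]]
scalars = [0, 2, -1]

rotate_ok = passes_linearity_checks(rotate90, vectors, scalars)
shift_ok = passes_linearity_checks(shift, vectors, scalars)
print(rotate_ok, shift_ok)
```

The translation fails because it does not map the zero vector to itself, which is the quickest sanity check for any candidate linear transformation.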

Untitled

2026-02-04 数学笔记/02.线性代数

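<h2 id="乘法和逆矩阵">Multiplication and Inverse Matrices<a class="anchor-link" href="#乘法和逆矩阵" title="Permanent link">¶</a></h2> <p>The previous lecture gave a rough introduction to matrix multiplication and inverse matrices; this...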

Untitled

2026-02-04 数学笔记/02.线性代数

<p>The inverses we have dealt with so far are all two-sided inverses, valid under both left and right multiplication, i.e. <span class="math-inline">A^{-1}A=I=AA^{-1}</span>. In this case, <span class="math-inline">m\tim...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="转换置换向量空间r">Transposes, Permutations, and the Vector Space R<a class="anchor-link" href="#转换置换向量空间r" title="Permanent link">¶</a></h2> <h2 id="置换矩阵p...

Lecture 7: Solving $Ax=0$, Pivot Variables, Special Solutions

2026-02-04 数学笔记/02.线性代数

<h1 id="第七讲求解ax0主变量特解">Lecture 7: Solving <span class="math-inline">Ax=0</span>, Pivot Variables, Special Solutions<a class="anchor-link" href="#第七讲求解ax0主变量特解" title="Perm...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="马尔科夫矩阵">Markov Matrices<a class="anchor-link" href="#马尔科夫矩阵" title="Permanent link">¶</a></h2> <p>A Markov matrix (Markov matr...

Untitled

2026-02-04 数学笔记/02.线性代数

<p>Properties of the determinant:</p> <ol> <li> <p><span class="math-inline">\det{I}=1</span>: the determinant of the identity matrix is one.</p> &l...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="特征值特征向量的由来">Where Eigenvalues and Eigenvectors Come From<a class="anchor-link" href="#特征值特征向量的由来" title="Permanent link">¶</a></h2> <p>Given a matrix&...

Untitled

2026-02-04 数学笔记/02.线性代数

<p>In the lecture on the four fundamental subspaces, we noted that for an <span class="math-inline">m \times n</span> matrix of rank <span class="math-inline">r</span>, its row space (<span class=...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="标准正交矩阵">Orthonormal Matrices<a class="anchor-link" href="#标准正交矩阵" title="Permanent link">¶</a></h2> <p>Define orthonormal vectors (orthonorm...

Untitled

2026-02-04 数学笔记/02.线性代数

<p><span class="math-inline">v_1,\ v_2,\ \cdots,\ v_n</span> are, for the <span class="math-inline">m\times n</span> matrix <span class=...

Lecture 1: The Geometric Interpretation of Systems of Equations

2026-02-04 数学笔记/02.线性代数

<h1 id="第一讲方程组的几何解释">Lecture 1: The Geometric Interpretation of Systems of Equations<a class="anchor-link" href="#第一讲方程组的几何解释" title="Permanent link">¶</a></h1> <p>We...

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="对称矩阵">Symmetric Matrices<a class="anchor-link" href="#对称矩阵" title="Permanent link">¶</a></h2> <p>Earlier we studied the eigenvalues and eigenvectors of a matrix, and also learned...

Untitled

2026-02-04 数学笔记/02.线性代数

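<p>This lecture mainly covers complex vectors and complex matrices (including how to take the dot product of complex vectors and what a complex symmetric matrix is), as well as the Fourier matrix (the most important complex matrix) and the fast Fourier transform.</p> <h2 id="复数矩阵运算">Complex Matrix Operations<a class="anchor-link" hr...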

Untitled

2026-02-04 数学笔记/02.线性代数

<h2 id="矩阵空间秩1矩阵和小世界图">Matrix Spaces, Rank-1 Matrices, and Small-World Graphs<a class="anchor-link" href="#矩阵空间秩1矩阵和小世界图" title="Permanent link">¶</a></h2> <h...

Untitled

2026-02-04 数学笔记/02.线性代数

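<p>At the start of this lecture, we pick up where the previous one left off and say more about positive definite matrices.</p> <ul> <li> <p>What properties does the inverse of a positive definite matrix have? We factor the positive definite matrix as <span class="math-inline">A=S\Lambda S^{-1}&l...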

Lecture 8: Solving $Ax=b$: Solvability and the Structure of Solutions

2026-02-04 数学笔记/02.线性代数

<h1 id="第八讲求解axb可解性和解的结构">Lecture 8: Solving <span class="math-inline">Ax=b</span>: Solvability and the Structure of Solutions<a class="anchor-link" href="#第八讲求解axb可解性和解的结构" tit...

Untitled

2026-02-04 数学笔记/05.分布

<h2 id="信息量">Information Content<a class="anchor-link" href="#信息量" title="Permanent link">¶</a></h2> <p>Suppose <span class="math-inl...

Prior and Posterior Distributions

2026-02-04 数学笔记/05.分布

<h2 id="1回顾贝叶斯定理">1. Reviewing Bayes' Theorem<a class="anchor-link" href="#1回顾贝叶斯定理" title="Permanent link">¶</a></h2> <p>First, let us review...

Untitled

2026-02-04 数学笔记/05.分布

<h2 id="统计量及其分布">Statistics and Their Distributions<a class="anchor-link" href="#统计量及其分布" title="Permanent link">¶</a></h2> <h3 id="总体与样本"><...

Simulate logits and ground-truth labels

2026-02-04 数学笔记/05.分布

<h2 id="交叉熵">Cross-Entropy<a class="anchor-link" href="#交叉熵" title="Permanent link">¶</a></h2> <p>Suppose there are two distributions <span class="math...

Untitled

2026-02-04 数学笔记/05.分布

<h2 id="梯度流">Gradient Flow<a class="anchor-link" href="#梯度流" title="Permanent link">¶</a></h2> <h3 id="欧式空间">Euclidean Space<a class=...

Untitled

2026-02-04 数学笔记/05.分布

<h2 id="推土机距离问题earth-movers-distance">The Earth Mover's Distance Problem<a class="anchor-link" href="#推土机距离问题earth-movers-distance" title="Perma...

Untitled

2026-02-04 数学笔记/05.分布

<h2 id="六假设检验">6. Hypothesis Testing<a class="anchor-link" href="#六假设检验" title="Permanent link">¶</a></h2> <h3 id="61-假设检验的基本思想和概念"&g...

Untitled

2026-02-04 数学笔记/05.分布

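<p>The difference between statistics and probability theory is the difference between induction and deduction: the former infers the population distribution from samples, while the latter studies samples given a known population distribution. Parameter estimation is therefore an inductive process, and it takes two forms: <strong>point estimation</strong> and <strong>interval estimation</strong><...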

Untitled

2026-02-04 数学笔记/05.分布

<blockquote> <p><a href="https://zhuanlan.zhihu.com/p/387938179">KL Divergence, Bhattacharyya Distance, and Wasserstein Distance Between Two Multivariate Normal Distributions</a></p> </blockquote> <...

Untitled

2026-02-04 数学笔记/05.分布

<h2 id="wgan">WGAN<a class="anchor-link" href="#wgan" title="Permanent link">¶</a></h2> <p>Suppose <span class="math-...

Untitled

2026-02-04 数学笔记/05.分布

<blockquote> <p>Source: <a href="https://zhuanlan.zhihu.com/p/639733453">The Past and Present of Optimal Transport | (1) From the Monge Problem to the Kantorovich Problem</a>&...

Untitled

2026-02-04 数学笔记/05.分布

<blockquote> <p><a href="https://zhuanlan.zhihu.com/p/662721431">EMO: A Classification Loss Function Designed Around Optimal Transport</a></p> </blockquote> <p...

Untitled

2026-02-04 数学笔记/04.矩阵

...

The Road to Low-Rank Approximation (3): CR

2026-02-04 数学笔记/04.矩阵

<h1 id="低秩近似之路三cr">The Road to Low-Rank Approximation (3): CR<a class="anchor-link" href="#低秩近似之路三cr" title="Permanent link">¶</a></h1> <p><st...

Untitled

2026-02-04 数学笔记/04.矩阵

<blockquote> <p><a href="https://so.csdn.net/so/search?q=%E7%90%86%E8%A7%A3%E7%9F%A9%E9%98%B5&t=blog&u=myan">Understanding Matrices (reposted from Meng Yan)<...

The Road to Low-Rank Approximation (2): SVD

2026-02-04 数学笔记/04.矩阵

<h1 id="低秩近似之路二svd">The Road to Low-Rank Approximation (2): SVD<a class="anchor-link" href="#低秩近似之路二svd" title="Permanent link">¶</a></h1> <p><...

The Road to Low-Rank Approximation (1): The Pseudoinverse

2026-02-04 数学笔记/04.矩阵

<h1 id="低秩近似之路一伪逆">The Road to Low-Rank Approximation (1): The Pseudoinverse<a class="anchor-link" href="#低秩近似之路一伪逆" title="Permanent link">¶</a></h1> <p><st...

Untitled

2026-02-04 数学笔记/04.矩阵

<blockquote> <p>Source: <a href="https://zhuanlan.zhihu.com/p/714259385">Monarch Matrices: A Computationally Efficient Sparse Matrix Factorization</a> <br><br /> For the best formatting, see...

Untitled

2026-02-04 数学笔记/04.矩阵

<h2 id="新理解矩阵1矩阵是什么"><a href="https://spaces.ac.cn/archives/1765">A New Understanding of Matrices 1: What Is a Matrix?</a><a class="anchor-link" href="#新理解矩阵1矩阵是什么" t...

Untitled

2026-02-04 数学笔记/03.概率论与数理统计

<h2 id="随机变量及其分布">Random Variables and Their Distributions<a class="anchor-link" href="#随机变量及其分布" title="Permanent link">¶</a></h2> <h3 id="随机变量的概念"&g...

Untitled

2026-02-04 数学笔记/03.概率论与数理统计

<h2 id="三随机变量的数字特征">3. Numerical Characteristics of Random Variables<a class="anchor-link" href="#三随机变量的数字特征" title="Permanent link">¶</a></h2> <h3 id="31-...

Untitled

2026-02-04 数学笔记/03.概率论与数理统计

<h2 id="一事件与概率">1. Events and Probability<a class="anchor-link" href="#一事件与概率" title="Permanent link">¶</a></h2> <h3 id="11-随机试验和随机事件"&g...

ES6 Syntax

2026-02-04 ProgramNotes

<h1 id="es6-语法">ES6 Syntax<a class="anchor-link" href="#es6-语法" title="Permanent link">¶</a></h1> <p>ECMAScript 6.0 (hereafter...

Untitled

2026-02-04 ProgramNotes

<h2 id="编码">Encoding<a class="anchor-link" href="#编码" title="Permanent link">¶</a></h2> <ol> <li> <p>Originally there were only 127...

Reference

2026-02-04 ProgramNotes

<h1 id="reference">Reference<a class="anchor-link" href="#reference" title="Permanent link">¶</a></h1> <ul> <li&...

Untitled

2026-02-04 ProgramNotes

<h2 id="1maven-简介">1. Introduction to Maven<a class="anchor-link" href="#1maven-简介" title="Permanent link">¶</a></h2> <p>The essence of Maven...

Transfer the local file /path/to/local/file to the remote path /path/to/remote/file

2026-02-04 ProgramNotes

<ul> <li>https://zhuanlan.zhihu.com/p/21999778</li> <li>https://blog.csdn.net/li528405176/article/details/82810342</li> ...

Untitled

2026-02-04 ProgramNotes

<h2 id="快捷键">Keyboard Shortcuts<a class="anchor-link" href="#快捷键" title="Permanent link">¶</a></h2> <h2 id="mac-键盘符号说明">Mac Keyboard Symbols Explained...

Reference Articles

2026-02-04 ProgramNotes

<h2 id="软件设计的整体流程">The Overall Software Design Process<a class="anchor-link" href="#软件设计的整体流程" title="Permanent link">¶</a></h2> <ul> <li&...

Untitled

2026-02-04 ProgramNotes

<h2 id="分布式文件系统">Distributed File Systems<a class="anchor-link" href="#分布式文件系统" title="Permanent link">¶</a></h2> <h2 id="什么是分布式文件系统"&g...

Zotero + Nutstore (坚果云): Multi-Device Reference Management

2026-02-04 ProgramNotes

<h1 id="zotero坚果云搞定多设备文献管理">Zotero + Nutstore (坚果云): Multi-Device Reference Management<a class="anchor-link" href="#zotero坚果云搞定多设备文献管理" title="Permanent link">¶</a>&l...

Untitled

2026-02-04 ProgramNotes

<h2 id="云平台核心">Cloud Platform Essentials<a class="anchor-link" href="#云平台核心" title="Permanent link">¶</a></h2> <h3 id="为什么用云平台">Why Use a Cloud Platform...

Untitled

2026-02-04 ProgramNotes

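<p>Learning to program really means learning high-level languages, that is, computer languages designed for humans.</p> <p>However, computers do not understand high-level languages; the code must be translated into binary by a compiler before it can run. Knowing a high-level language does not mean understanding the steps the computer actually performs.</p> <p><img alt="i...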

git commit: commit staged file changes to the local repository

2026-02-04 ProgramNotes/90.Git

<p><img src="https://markdownimg-hw.oss-cn-beijing.aliyuncs.com/image202211042053980.png"/></p> <hr/> <blockquote> <p...

Viewing Help for Linux Commands

2026-02-04 ProgramNotes/01.Linux

<h1 id="查看-linux-命令帮助信息">Viewing Help for Linux Commands<a class="anchor-link" href="#查看-linux-命令帮助信息" title="Permanent link">¶</a></h1> ...

scp

2026-02-04 ProgramNotes/01.Linux

<h1 id="scp">scp<a class="anchor-link" href="#scp" title="Permanent link">¶</a></h1> <p>Copies files between a local host and a remote host over an encrypted connection</p...

Viewing and Editing File Contents on Linux

2026-02-04 ProgramNotes/01.Linux

<h1 id="linux-文件内容查看编辑">Viewing and Editing File Contents on Linux<a class="anchor-link" href="#linux-文件内容查看编辑" title="Permanent link">¶</a></h1> <...

Untitled

2026-02-04 ProgramNotes/01.Linux

<h2 id="学习资源">Learning Resources<a class="anchor-link" href="#学习资源" title="Permanent link">¶</a></h2> <ul> <li><a href="...

The Art of Command Line

2026-02-04 ProgramNotes/01.Linux

<blockquote> <p>Reposted from https://github.com/jlevy/the-art-of-command-line</p> </blockquote> <p><em><a href="README-c...

free

2026-02-04 ProgramNotes/01.Linux

<h1 id="free">free<a class="anchor-link" href="#free" title="Permanent link">¶</a></h1> <p>Displays memory usage</p> <...

top

2026-02-04 ProgramNotes/01.Linux

<h1 id="top">top<a class="anchor-link" href="#top" title="Permanent link">¶</a></h1> <p>Displays or manages running programs</p> <h...

Linux System Administration

2026-02-04 ProgramNotes/01.Linux

<h1 id="linux-系统管理">Linux System Administration<a class="anchor-link" href="#linux-系统管理" title="Permanent link">¶</a></h1> <blockquote&...

iotop

2026-02-04 ProgramNotes/01.Linux

<h1 id="iotop">iotop<a class="anchor-link" href="#iotop" title="Permanent link">¶</a></h1> <p>A tool for monitoring disk I/O usage<...

Linux Hardware Management

2026-02-04 ProgramNotes/01.Linux

<h1 id="linux-硬件管理">Linux Hardware Management<a class="anchor-link" href="#linux-硬件管理" title="Permanent link">¶</a></h1> <blockquote&...

grep

2026-02-04 ProgramNotes/01.Linux

<h1 id="grep">grep<a class="anchor-link" href="#grep" title="Permanent link">¶</a></h1> <p>A powerful text search tool</p> <...

Creating a User and Adding It to a Group

2026-02-04 ProgramNotes/01.Linux

<blockquote> <p>Keywords: <code>groupadd</code>, <code>groupdel</code>, <code>groupmod</code>, <code>u...

Untitled

2026-02-04 ProgramNotes/01.Linux

<h2 id="linux命令">Linux Commands<a class="anchor-link" href="#linux命令" title="Permanent link">¶</a></h2> <ul> <li>View ...

vmstat

2026-02-04 ProgramNotes/01.Linux

<h1 id="vmstat">vmstat<a class="anchor-link" href="#vmstat" title="Permanent link">¶</a></h1> <p>Displays virtual memory status</p>...

Linux Network Management

2026-02-04 ProgramNotes/01.Linux

<h1 id="linux-网络管理">Linux Network Management<a class="anchor-link" href="#linux-网络管理" title="Permanent link">¶</a></h1> <blockquote&...

Linux File and Directory Management

2026-02-04 ProgramNotes/01.Linux

<h1 id="linux-文件目录管理">Linux File and Directory Management<a class="anchor-link" href="#linux-文件目录管理" title="Permanent link">¶</a></h1> <block...

iostat

2026-02-04 ProgramNotes/01.Linux

<h1 id="iostat">iostat<a class="anchor-link" href="#iostat" title="Permanent link">¶</a></h1> <p>Monitors system input/output devices and CPU usage...

Linux Software Management

2026-02-04 ProgramNotes/01.Linux

<h1 id="linux-软件管理">Linux Software Management<a class="anchor-link" href="#linux-软件管理" title="Permanent link">¶</a></h1> <blockquote&...

Linux File Compression and Decompression

2026-02-04 ProgramNotes/01.Linux

<h1 id="linux-文件压缩和解压">Linux File Compression and Decompression<a class="anchor-link" href="#linux-文件压缩和解压" title="Permanent link">¶</a></h1> <bl...

Untitled

2026-02-04 Others/01.mac

<h2 id="homebrew">Homebrew<a class="anchor-link" href="#homebrew" title="Permanent link">¶</a></h2> <p>.bash_profil...

Untitled

2026-02-04 Others/08.GitHub

<h2 id="accuracy">accuracy<a class="anchor-link" href="#accuracy" title="Permanent link">¶</a></h2> <p>title：Questi...

Untitled

2026-02-04 Others/08.GitHub

<p>TODO</p> <blockquote> <p>GitHub Actions official documentation: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-a...

Create the mount directory

2026-02-04 Others/06.实用工具

<p><img alt="image-20231103170538800" src="https://markdownimg-hw.oss-cn-beijing.aliyuncs.com/image202311031705837.png" /></p> <h...

Untitled

2026-02-04 Others/06.实用工具

<h2 id="图像">Images<a class="anchor-link" href="#图像" title="Permanent link">¶</a></h2> <div align=center><img src=""/...

Untitled

2026-02-04 Others/06.实用工具

<h2 id="表-1-数学模式重音符">Table 1: Math-Mode Accents<a class="anchor-link" href="#表-1-数学模式重音符" title="Permanent link">¶</a></h2> <table&g...

Untitled

2026-02-04 Others/06.实用工具

<h2 id="readme-模板">README Template<a class="anchor-link" href="#readme-模板" title="Permanent link">¶</a></h2> <div align=cent...

Check the current version

2026-02-04 Others/04.ai

<h2 id="conda">conda<a class="anchor-link" href="#conda" title="Permanent link">¶</a></h2> <p>conda is a package, dependency, and environment management tool, suitable...

Untitled

2026-02-04 Others/04.ai

<h2 id="cuda安装多版本切换">CUDA Installation (Switching Between Multiple Versions)<a class="anchor-link" href="#cuda安装多版本切换" title="Permanent link">¶</a></h2> <p>Apart from...

Untitled

2026-02-04 Others/04.ai

<h2 id="pytorch">PyTorch<a class="anchor-link" href="#pytorch" title="Permanent link">¶</a></h2> <p>During development, there may be several projects at the same...

Untitled

2026-02-04 Others/07.NAS

<h2 id="nas网络篇">NAS: Networking<a class="anchor-link" href="#nas网络篇" title="Permanent link">¶</a></h2> <p>Suppose the NAS is set up at home, and family members...

Untitled

2026-02-04 Others/07.NAS

<h2 id="nas硬盘篇">NAS: Hard Drives<a class="anchor-link" href="#nas硬盘篇" title="Permanent link">¶</a></h2> <p>Hard drives currently on the market fall mainly into enterprise drives, NA...

Untitled

2026-02-04 Others/07.NAS

<h2 id="nas介绍篇">NAS: Introduction<a class="anchor-link" href="#nas介绍篇" title="Permanent link">¶</a></h2> <p>A NAS can be thought of as a "simplified", "private" storage...

Untitled

2026-02-04 Others/07.NAS

<h2 id="nas玩法篇">NAS: Use Cases<a class="anchor-link" href="#nas玩法篇" title="Permanent link">¶</a></h2> <h3 id="家庭影院">Home Theater<...

Untitled

2026-02-04 Others/07.NAS

<h2 id="nas选购篇">NAS: Buying Guide<a class="anchor-link" href="#nas选购篇" title="Permanent link">¶</a></h2> <p>This section covers every NAS product model currently on sale...

Create the mount directory

2026-02-04 Others/03.linux

<h2 id="anaconda">Anaconda<a class="anchor-link" href="#anaconda" title="Permanent link">¶</a></h2> <ol> <li>...

Untitled

2026-02-04 Others/09.Tex

<h3 id="2-table-环境"><strong>2. The <code>table</code> Environment</strong><a class="anchor-link" href="#2-table-环境" title="Permane...

Untitled

2026-02-04 Others/09.Tex

<h3 id="1-figure-环境"><strong>1. The <code>figure</code> Environment</strong><a class="anchor-link" href="#1-figure-环境" title="Perm...

Untitled

2026-02-04 Others/09.Tex

<ol> <li><strong>Document structure</strong>:</li> </ol> <ul> <li>Use the <code>\documentclass</code> command to set the document type, ...

Untitled

2026-02-04 Others/05.IDE

<h2 id="mac-中-idea-快捷键">IDEA Keyboard Shortcuts on Mac<a class="anchor-link" href="#mac-中-idea-快捷键" title="Permanent link">¶</a></h2> <...

Untitled

2026-02-04 Others/05.IDE

<h2 id="pycharm-远程连接服务器">Connecting PyCharm to a Remote Server<a class="anchor-link" href="#pycharm-远程连接服务器" title="Permanent link">¶</a></h2> ...

Untitled

2026-02-04 Others/02.win

<h2 id="anaconda">Anaconda<a class="anchor-link" href="#anaconda" title="Permanent link">¶</a></h2> <ul> <li>...

Untitled

2026-02-04 Others/02.win

<div align=center><img src="https://markdownimg-hw.oss-cn-beijing.aliyuncs.com/20241027212237.png" style="zoom: 60%;" /></div>...

Untitled

2026-02-04 大模型

<div align=center><img src="https://markdownimg-hw.oss-cn-beijing.aliyuncs.com/20240528122659.png"/></div> <h3 id="为什么现在的-llm-都是...

Introduction

2026-02-04 大模型

<h2 id="基于大模型的智能体agent">LLM-Based Agents<a class="anchor-link" href="#基于大模型的智能体agent" title="Permanent link">¶</a></h2> &...

Untitled

2026-02-04 大模型

<blockquote> <p><a href="https://zhuanlan.zhihu.com/p/667489780">Analyzing Scaling Laws in Large Models</a></p> </blockquote> <p>...

1. The Evolution of Llama

2026-02-04 大模型/05.常见模型篇

<p>TODO</p> <blockquote> <p>LLAMA(Large Language Model Meta AI)</p> </blockquote> <h2 id="llama1">LLaMA1<...

Untitled

2026-02-04 大模型/05.常见模型篇

<h2 id="chatglm-6b">ChatGLM-6B<a class="anchor-link" href="#chatglm-6b" title="Permanent link">¶</a></h2> <blockquote&...

Untitled

2026-02-04 大模型/00.前置篇

<h2 id="实战">Hands-On Practice<a class="anchor-link" href="#实战" title="Permanent link">¶</a></h2> <h3 id="self-llm">self-llm<a c...

Untitled

2026-02-04 大模型/04.微调篇

...

Untitled

2026-02-04 大模型/04.微调篇

...

Untitled

2026-02-04 大模型/04.微调篇

...

Chapter 7: Adaptation of Large Models

2026-02-04 大模型/04.微调篇

<h1 id="第7章-大模型之adaptation">Chapter 7: Adaptation of Large Models<a class="anchor-link" href="#第7章-大模型之adaptation" title="Permanent link">¶</a><...

Untitled

2026-02-04 大模型/04.微调篇

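<p>Three milestones in large-model fine-tuning:</p> <ul> <li><strong>Milestone 1: ChatGPT</strong>: ChatGPT's striking results made people aware of the possibility of AGI and drew attention to the combination of large models + open instruction tuning + reinforcement learning, a three...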

[Prompt Engineering, RAG, and Fine-Tuning: Which Is the Best Path for Optimizing LLM Applications?](https://mp.weixin.qq.com/s/muOs95RFVQlr-WuTJ8tT3w)

2026-02-04 大模型/04.微调篇

<h1 id="提示工程rag和微调---哪个才是大模型应用优化的最佳路径"><a href="https://mp.weixin.qq.com/s/muOs95RFVQlr-WuTJ8tT3w">Prompt Engineering, RAG, and Fine-Tuning: Which Is the Best Path for Optimizing LLM Applications?</a&g...

[LLM] Training a Large Model from Scratch

2026-02-04 大模型/03.训练篇

<h1 id="llm从零开始训练大模型">[LLM] Training a Large Model from Scratch<a class="anchor-link" href="#llm从零开始训练大模型" title="Permanent link">¶</a></h1> <p&g...

-------------------------------------------------------------------------------

2026-02-04 大模型/03.训练篇

<p><a href="https://zhuanlan.zhihu.com/p/681692152">Illustrated Large-Model Training Series: DeepSpeed-Megatron MoE Parallel Training (Source Code Walkthrough)</a></p> <p>Hello everyone. Just ahead of the holiday, MoE...

Untitled

2026-02-04 大模型/03.训练篇

<h2 id="大模型框架分类整理">A Categorized Overview of Large-Model Frameworks<a class="anchor-link" href="#大模型框架分类整理" title="Permanent link">¶</a></h2> <h3 id="一训练框架"&...

Untitled

2026-02-04 大模型/03.训练篇

<blockquote> <p><a href="https://zhuanlan.zhihu.com/p/20329244481">Large-Model Precision: FP32, TF32, FP16, BF16, FP8, FP4, NF4, INT8</a></p> &l...

Untitled

2026-02-04 大模型/03.训练篇

<h2 id="1-预训练阶段pretraining-stage">1. Pretraining Stage<a class="anchor-link" href="#1-预训练阶段pretraining-stage" title="Permanent link"&g...

Chapter 8: Distributed Training

2026-02-04 大模型/03.训练篇

<h1 id="第8章-分布式训练">Chapter 8: Distributed Training<a class="anchor-link" href="#第8章-分布式训练" title="Permanent link">¶</a></h1> <h2 id="81-为什么分...

Untitled

2026-02-04 大模型/03.训练篇

<h2 id="大语言模型背后的数据">The Data Behind Large Language Models<a class="anchor-link" href="#大语言模型背后的数据" title="Permanent link">¶</a></h2> <p>We need to be clear that...

Untitled

2026-02-04 大模型/03.训练篇

<p>Below is a summary of the approaches used by common large models at each training stage, together with sample data for each stage:</p> <hr /> <h3 id="1-预训练阶段pretraining"><strong>1. Pretraining</strong>...

Untitled

2026-02-04 大模型/07.实战篇

<blockquote> <p><a href="https://www.zhihu.com/question/648879790/answer/3504152602">Which Large Models Are Currently Recommended for Local Deployment?</a></p> </blo...

Upgrade pip

2026-02-04 大模型/07.实战篇/101.Qwen2.5-7B-Instruct

<h2 id="环境准备">Environment Setup<a class="anchor-link" href="#环境准备" title="Permanent link">¶</a></h2> <p>The base environment for this article is as follows:</p> <...

Upgrade pip

2026-02-04 大模型/07.实战篇/101.Qwen2.5-7B-Instruct

<h2 id="环境准备">Environment Setup<a class="anchor-link" href="#环境准备" title="Permanent link">¶</a></h2> <p>The base environment for this article is as follows:</p> <...

model_download.py

2026-02-04 大模型/07.实战篇/101.Qwen2.5-7B-Instruct

<h2 id="openai-o1-model-简介"><strong>Introduction to the OpenAI o1 Model</strong><a class="anchor-link" href="#openai-o1-model-简介" title="Permanent...

Switch the PyPI mirror to speed up library installation

2026-02-04 大模型/07.实战篇/101.Qwen2.5-7B-Instruct

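<p>This section briefly describes how to fine-tune the Qwen2.5-7B-Instruct model with LoRA using frameworks such as transformers and peft. LoRA is an efficient fine-tuning method; for an in-depth look at how it works, see the blog post: <a href="https://zhuanlan.zhihu.com/p/65...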

model_download.py

2026-02-04 大模型/07.实战篇/101.Qwen2.5-7B-Instruct

<h2 id="vllm-简介">Introduction to vLLM<a class="anchor-link" href="#vllm-简介" title="Permanent link">¶</a></h2> <p><code>vLL...

Upgrade pip

2026-02-04 大模型/07.实战篇/101.Qwen2.5-7B-Instruct

<h2 id="环境准备">Environment Setup<a class="anchor-link" href="#环境准备" title="Permanent link">¶</a></h2> <pre><code>----------...

Switch the PyPI mirror to speed up library installation

2026-02-04 大模型/07.实战篇/101.Qwen2.5-7B-Instruct

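<p>This section briefly describes LoRA fine-tuning of the Qwen2.5-7B-Instruct model on the <strong>Chinese legal question-answering dataset DISC-Law-SFT</strong>, using frameworks such as transformers and peft, while also using <a href="h...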

Untitled

2026-02-04 大模型/02.架构篇

<blockquote> <p><a href="https://zhuanlan.zhihu.com/p/662498827">Accelerating Large-Model Inference: Learning KV Cache Through Pictures</a></p> </blockquote> <p&g...

Untitled

2026-02-04 大模型/02.架构篇

<h2 id="模型架构">Model Architecture<a class="anchor-link" href="#模型架构" title="Permanent link">¶</a></h2> <p>So far, we have defined a language model over sequences of tokens...

Untitled

2026-02-04 大模型/01.基础篇

<h2 id="token模型理解和处理的基本单位">Token: The Basic Unit a Model Understands and Processes<a class="anchor-link" href="#token模型理解和处理的基本单位" title="Permanent link">¶</a></...

Untitled

2026-02-04 大模型/01.基础篇

<h2 id="大模型-ai-应用全栈开发知识体系">A Full-Stack Knowledge Map for Large-Model AI Application Development<a class="anchor-link" href="#大模型-ai-应用全栈开发知识体系" title="Permanent link">¶</a></h...

Untitled

2026-02-04 大模型/01.基础篇

<h2 id="分词">Tokenization<a class="anchor-link" href="#分词" title="Permanent link">¶</a></h2> <p>A language model <span class="math-inli...

Untitled

2026-02-04 大模型/01.基础篇

...

Untitled

2026-02-04 大模型/06.大模型持续学习

...

Untitled

2026-02-04 大模型/06.大模型持续学习

...

Untitled

2026-02-04 大模型/06.大模型持续学习

...

Untitled

2026-02-04 AINotes/15.对比学习

<blockquote> <p>Source: <a href="https://zhuanlan.zhihu.com/p/634466306">What Does Contrastive Learning Learn?</a></p> </blockquote> <p>Contrastive learning...

Untitled

2026-02-04 AINotes/09.MoE

<h2 id="multi-modal-gated-mixture-of-local-to-global-experts-for-dynamic-image-fusion"><a href="https://arxiv.org/abs/2302.01392">Multi-Mo...

Untitled

2026-02-04 AINotes/09.MoE

<p>With Mixtral 8x7B (<a href="https://mistral.ai/news/mixtral-of-experts/">announcement</a>, <a href="https://huggingface.co/mistr...

Untitled

2026-02-04 AINotes/09.MoE

<h2 id="st-moe-designing-stableand-transferable-sparse-expert-models">ST-MoE: Designing Stable and Transferable Sparse Expert Models<a class="...

Here we assume n_embed is defined as 32, num_experts=4, top_k=2

2026-02-04 AINotes/09.MoE

<p><a href="https://zhuanlan.zhihu.com/p/701777558">Implementing an MOE (Mixture of Experts) from Scratch</a></p> <h2 id="什么是混合模型moe">What Is a Mixture Model (MOE)<a clas...

Untitled

2026-02-04 AINotes/09.MoE

<h2 id="patch-level-routing-in-mixture-of-experts-is-provably-sample-efficient-for-convolutional-neural-networks">Patch-level Routing in Mixture...

Untitled

2026-02-04 AINotes/09.MoE

<h2 id="moe模型">MoE Models<a class="anchor-link" href="#moe模型" title="Permanent link">¶</a></h2> <table> <thead> &l...

Untitled

2026-02-04 AINotes/09.MoE

...

Untitled

2026-02-04 AINotes/09.MoE

<h2 id="one-student-knows-all-experts-know-from-sparse-to-dense">One Student Knows All Experts Know: From Sparse to Dense<a class="anchor-lin...

Untitled

2026-02-04 AINotes/09.MoE

<h2 id="scaling-vision-with-sparse-mixture-of-experts"><a href="https://arxiv.org/pdf/2106.05974.pdf">Scaling Vision with Sparse Mixture o...

Untitled

2026-02-04 AINotes/09.MoE

<h2 id="from-sparse-to-soft-mixtures-of-experts">From Sparse to Soft Mixtures of Experts<a class="anchor-link" href="#from-sparse-to-soft-mix...

use sinusoidal position embedding to encode time step (https://arxiv.org/abs/1706.03762)

2026-02-04 AINotes/06.扩散模型

<blockquote> <p><a href="https://zhuanlan.zhihu.com/p/535042237">A Casual Discussion of Generative Diffusion Models (1): DDPM = Demolition + Construction</a></p> </blockquote> ...

Text-to-Image Models: Stable Diffusion

2026-02-04 AINotes/06.扩散模型

<p><strong>Author:</strong> [Hao Bai]</p> <p><strong>Link:</strong> [https://www.zhihu.com/question/53601228...

Untitled

2026-02-04 AINotes/06.扩散模型

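<p>When it comes to generative models, the first that comes to mind is the GAN. A GAN is <strong>an implicit generative model realized through adversarial training</strong>. Although GANs are powerful, there are many other generative models that differ from them, the most common being those based on <strong>likelihood maximization</strong...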

Untitled

2026-02-04 AINotes/06.扩散模型

<p><a href="https://zhuanlan.zhihu.com/p/11439249557">Diffusion Models vs. Optimal Transport</a></p> <h2 id="references">References<a class="a...

Diffusion Models: DDIM

2026-02-04 AINotes/06.扩散模型

<h1 id="扩散模型之ddim">Diffusion Models: DDIM<a class="anchor-link" href="#扩散模型之ddim" title="Permanent link">¶</a></h1> <p><stron...

Untitled

2026-02-04 AINotes/08.PTM

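<p>Automatically learning features with deep learning has gradually replaced hand-crafted features and statistical methods. A key problem, however, is that it requires large amounts of data; otherwise the model overfits because it has too many parameters. Obtaining that data is very costly, so a key question has long been studied: how can effective deep learning models be trained with limited data?</p> <p>An important milestone...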

Untitled

2026-02-04 AINotes/08.PTM

<h2 id="kimi全文翻译-arrow_down">Full-Text Translation by Kimi :arrow_down:<a class="anchor-link" href="#kimi全文翻译-arrow_down" title="Permanent link">¶</a&g...

Untitled

2026-02-04 AINotes/03.循环神经网络

<h2 id="recurrent-neural-networks">Recurrent Neural Networks<a class="anchor-link" href="#recurrent-neural-networks" title="Permanent link"&g...

Untitled

2026-02-04 AINotes/03.循环神经网络

<h2 id="其他rnn">Other RNNs<a class="anchor-link" href="#其他rnn" title="Permanent link">¶</a></h2> <div align=center><im...

Untitled

2026-02-04 AINotes/14.PEFT

<h2 id="propulsion-steering-llm-with-tiny-fine-tuning"><a href="https://arxiv.org/abs/2409.10927">Propulsion: Steering LLM with Tiny Fine-...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA

<blockquote> <p>Source: <a href="https://kexue.fm/archives/10226">Aligning with Full Fine-Tuning! The Most Brilliant LoRA Improvement I Have Seen (1)</a></p> </blockquote> <p...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA

<blockquote> <ul> <li>Commentary source: <a href="https://kexue.fm/archives/10266">Aligning with Full Fine-Tuning! The Most Brilliant LoRA Improvement I Have Seen (2)</a></li> <li>...

------------------------------------

2026-02-04 AINotes/14.PEFT/01.LoRA

<h2 id="一全参数微调">1. Full-Parameter Fine-Tuning<a class="anchor-link" href="#一全参数微调" title="Permanent link">¶</a></h2> <div align=center>&l...

------------------------------------

2026-02-04 AINotes/14.PEFT/01.LoRA

<h2 id="一adalora在做一件什么事">1. What AdaLoRA Is Doing<a class="anchor-link" href="#一adalora在做一件什么事" title="Permanent link">¶</a></h2>...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA

<blockquote> <p><a href="https://arxiv.org/abs/2402.12354">LoRA+: Efficient Low Rank Adaptation of Large Models</a></p>...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA

<h2 id="learning-attentional-mixture-ofloras-for-language-model-continual-learning">Learning Attentional Mixture of LoRAs for Language Model Cont...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA

<h2 id="kimi全文翻译-arrow_down">Full-Text Translation by Kimi :arrow_down:<a class="anchor-link" href="#kimi全文翻译-arrow_down" title="Permanent link">¶</a&g...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA

<h2 id="milora-effcient-mixture-of-low-rank-adaptation-for-large-language-models-fine-tuning">MiLoRA: Efficient Mixture of Low-Rank Adaptation fo...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA/10.LoRAMoE

<h2 id="loramoe">LoRAMoE<a class="anchor-link" href="#loramoe" title="Permanent link">¶</a></h2> <h2 id="1-背景">1. Background...

Untitled

2026-02-04 AINotes/14.PEFT/01.LoRA/10.LoRAMoE

<h2 id="mixture-of-lora-experts"><a href="https://arxiv.org/abs/2404.13628">Mixture of LoRA Experts</a><a class="anchor-link" hre...

Untitled

2026-02-04 AINotes/14.PEFT/04.P-Tuning

...

Untitled

2026-02-04 AINotes/14.PEFT/02.PrefixTuning

<h2 id="prefix-tuning-optimizing-continuous-prompts-for-generationacl-2021">Prefix-Tuning: Optimizing Continuous Prompts for Generation (ACL 2021...

Untitled

2026-02-04 AINotes/14.PEFT/00.Survey

<h2 id="lst-ladder-side-tuning-for-parameter-and-memory-efficient-transfer-learning">LST: Ladder Side-Tuning for Parameter and Memory Efficient ...

Untitled

2026-02-04 AINotes/14.PEFT/00.Survey

<blockquote> <p><a href="https://github.com/jxhe/unify-parameter-efficient-tuning">Code</a></p> </blockquote> <...

Untitled

2026-02-04 AINotes/14.PEFT/00.Survey

<h2 id="0-摘要">0 Abstract<a class="anchor-link" href="#0-摘要" title="Permanent link">¶</a></h2> <p>With Transformer-based pretrained language models...

Untitled

2026-02-04 AINotes/14.PEFT/03.PromptTuning

<h2 id="面向预训练语言模型的-prompt-tuning-技术发展历程">The Development of Prompt-Tuning Techniques for Pretrained Language Models<a class="anchor-link" href="#面向预训练语言模型的-prompt-tuning-技术发展历程" title=...

Untitled

2026-02-04 AINotes/14.PEFT/05.Adapter

<blockquote> <p><a href="https://zhuanlan.zhihu.com/p/718589490">R-Adapter: A Breakthrough in Zero-Shot Model Fine-Tuning That Improves Robustness and Generalization | ECCV 2024</a></p> <...

Untitled

2026-02-04 AINotes/14.PEFT/05.Adapter

<h2 id="parameter-efficient-transfer-learning-for-nlp-adaptericml-2019">Parameter-Efficient Transfer Learning for NLP Adapter (ICML 2019)<a cl...

Untitled

2026-02-04 AINotes/02.卷积神经网络

<blockquote> <p>Convolutional Neural Networks</p> </blockquote> <h2 id="11-为什么cnn">1.1 Why CNNs<a class="anchor-link" h...

Untitled

2026-02-04 AINotes/44.LTCIL

<h2 id="adaptive-adapter-routing-for-long-tailed-class-incremental-learning">Adaptive Adapter Routing for Long-Tailed Class-Incremental Learning...

Untitled

2026-02-04 AINotes/44.LTCIL/00.Survey

<h2 id="cvpr2024-gr">CVPR2024 GR<a class="anchor-link" href="#cvpr2024-gr" title="Permanent link">¶</a></h2> <blockquo...

Untitled

2026-02-04 AINotes/22.Mamba

<p>Paper title: VMamba: Visual State Space Model</p> <p>Paper URL: https://arxiv.org/abs/2401.10166</p> <p>Code URL: https://github.com/M...

Untitled

2026-02-04 AINotes/22.Mamba

<h2 id="背景">Background<a class="anchor-link" href="#背景" title="Permanent link">¶</a></h2> <p>Transformer: known for its attention mechanism, in which any part of the sequence...

Untitled

2026-02-04 AINotes/22.Mamba

<table> <thead> <tr> <th>Symbol</th> <th>Dimension</th> <th>Description</th> <th>Default</th> </tr> ...

Untitled

2026-02-04 AINotes/22.Mamba

<h2 id="selection-mechanism">Selection Mechanism<a class="anchor-link" href="#selection-mechanism" title="Permanent link">¶</a>...

Untitled

2026-02-04 AINotes/22.Mamba

<p>Paper URL: https://arxiv.org/pdf/2401.09417.pdf<br /> Project URL: https://github.com/hustvl/Vim<br /> Paper title: Vision Mamba: Efficient Visual Repr...

load pre-trained image processor for efficientnet-b7 and model weight

2026-02-04 AINotes/07.计算机视觉

<blockquote> <p><a href="https://mp.weixin.qq.com/s/mvozZ_iIRtFgmHUoAne80Q">Comparing Image Similarity Search: EfficientNet vs. ViT vs. DINO-v2 vs. CLIP vs. ...

Untitled

2026-02-04 AINotes/07.计算机视觉/01DINO系列讲解

...

Untitled

2026-02-04 AINotes/05.VisionTransformer

<h2 id="0-摘要">0. Abstract<a class="anchor-link" href="#0-摘要" title="Permanent link">¶</a></h2> <p>Parameter-efficient fine-tuning (PEFT) has emerged for adapting pretrained ViT...

Untitled

2026-02-04 AINotes/05.VisionTransformer

<h2 id="an-image-is-worth-16x16-words-transformers-for-image-recognition-at-scale">An Image is Worth 16x16 Words: Transformers for Image Recogni...

How to Interpolate Position Embeddings When Fine-Tuning ViT [Source Code Analysis]

2026-02-04 AINotes/05.VisionTransformer

<h1 id="vit-微调时position-embedding如何插值interpolate源码解析">How to Interpolate Position Embeddings When Fine-Tuning ViT [Source Code Analysis]<a class="anchor-link" href="#vit-微调时po...

Untitled

2026-02-04 AINotes/05.VisionTransformer

<h2 id="peeling-back-the-layers-interpreting-the-storytelling-of-vit"><a href="https://dl.acm.org/doi/10.1145/3664647.3681712">Peeling Bac...

Untitled

2026-02-04 AINotes/05.VisionTransformer

<h2 id="dynamic-tuning-towards-parameter-and-inference-efficiency-for-vit-adaptation"><a href="http://arxiv.org/abs/2403.11808">Dynamic Tu...

Untitled

2026-02-04 AINotes/05.VisionTransformer

<h2 id="vision-transformer-详解">Vision Transformer Explained<a class="anchor-link" href="#vision-transformer-详解" title="Permanent link">¶<...

Untitled

2026-02-04 AINotes/05.VisionTransformer

<h2 id="vision-transformer-to-discover-the-four-secrets-of-image-patches"><a href="https://linkinghub.elsevier.com/retrieve/pii/S156625352400...

Untitled

2026-02-04 AINotes/05.VisionTransformer

<h2 id="符号定义">Notation<a class="anchor-link" href="#符号定义" title="Permanent link">¶</a></h2> <p>Table 1 of the paper lists three models (Bas...

Untitled

2026-02-04 AINotes/04.Transformer

<h3 id="self-attention-vs-cnn3">Self-attention v.s. CNN3<a class="anchor-link" href="#self-attention-vs-cnn3" title="Permanent link">¶...

Untitled

2026-02-04 AINotes/04.Transformer

<p><a href="https://zhuanlan.zhihu.com/p/338817680">The Transformer Model Explained (Most Complete Illustrated Edition)</a></p> <h2 id="前言">Preface<a class="anchor-li...

Untitled

2026-02-04 AINotes/04.Transformer

<blockquote> <p><a href="https://hw-universal.oss-cn-beijing.aliyuncs.com/self_v7.pptx">self_v7.pptx</a></p> </blockq...

Bert_李宏毅

2026-02-04 AINotes/04.Transformer

<h2 id="seq2seq">Seq2Seq<a class="anchor-link" href="#seq2seq" title="Permanent link">¶</a></h2> <p>Before we start explaining Attention...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="transformer-与-ffn">Transformer and FFN<a class="anchor-link" href="#transformer-与-ffn" title="Permanent link">¶</a></h...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="a-high-level-look">A High-Level Look<a class="anchor-link" href="#a-high-level-look" title="Permanent link">¶</a></h...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="multi-head-self-attention">Multi-head Self-attention<a class="anchor-link" href="#multi-head-self-attention" title="Permanent link"&g...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="多层transformer">Multi-layer Transformer<a class="anchor-link" href="#多层transformer" title="Permanent link">¶</a></h2> <p&...

Untitled

2026-02-04 AINotes/04.Transformer

<blockquote> <p>Source: <a href="https://www.zhihu.com/question/592626839/answer/3304714001">Why does Self-Attention compute Q, K, and V through linear transformations? The underlying principle or intuitive explanation...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="transformer">Transformer<a class="anchor-link" href="#transformer" title="Permanent link">¶</a></h2> <blockquo...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="理论推导">Theoretical Derivation<a class="anchor-link" href="#理论推导" title="Permanent link">¶</a></h2> <p>The input to Self-Attention is a sequence of...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="1-前言">1. Preface<a class="anchor-link" href="#1-前言" title="Permanent link">¶</a></h2> <p>Recently, ChatGPT, released by OpenAI, has demonstrated...

Variational Autoencoder (VAE): So That's How It Works | With Open-Source Code

2026-02-04 AINotes/04.Transformer

<blockquote> <p>Source: <a href="https://zhuanlan.zhihu.com/p/348498294">Machine Learning Methods, Elegant Models (1): Variational Autoencoder (VAE)</a></p> </blockquot...

Untitled

2026-02-04 AINotes/04.Transformer

<h2 id="applications-">Applications …<a class="anchor-link" href="#applications-" title="Permanent link">¶</a></h2> <p...

Untitled

2026-02-04 AINotes/999.深度学习调参指南

<h3 id="选择优化器">Choosing the Optimizer<a class="anchor-link" href="#选择优化器" title="Permanent link">¶</a></h3> <p><strong><em&...

Untitled

2026-02-04 AINotes/999.深度学习调参指南

<h3 id="选择模型架构">Choosing the Model Architecture<a class="anchor-link" href="#选择模型架构" title="Permanent link">¶</a></h3> <p><strong><...

Untitled

2026-02-04 AINotes/999.深度学习调参指南

<ul> <li>Pre-trained parameters are the best way to initialize a model, followed by Xavier initialization.</li> </ul>...

Untitled

2026-02-04 AINotes/999.深度学习调参指南

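<ul> <li>ReLU, Sigmoid, Softmax, and Tanh are the four most commonly used activation functions.</li> <li>Output layers commonly use Sigmoid or Softmax, hidden layers commonly use ReLU, and RNNs commonly use Tanh.</li> &l...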

Untitled

2026-02-04 AINotes/999.深度学习调参指南

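<ul> <li>It is best to decay the learning rate from high to low by a factor of 2 at a time, typically starting from 0.01.</li> <li>When fine-tuning, a learning rate of 0.0001 works well. There are many tricks for setting the learning rate, including cosine learning rate and so on...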

Untitled

2026-02-04 AINotes/999.深度学习调参指南

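<ul> <li>The number of epochs and early stopping are closely related: log the loss to see at which epoch the model performs best, and stop early at that point.</li> <li>Too many epochs waste compute; with too few epochs, training...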

Untitled

2026-02-04 AINotes/999.深度学习调参指南

<blockquote> <ul> <li><a href="https://github.com/schrodingercatss/tuning_playbook_zh_cn">tuning_playbook_zh_cn</a></...

Untitled

2026-02-04 AINotes/999.深度学习调参指南

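<ul> <li>Focal loss works remarkably well on extremely imbalanced datasets; its gamma factor can be decayed in multiples of 10.</li> <li>The loss function is the third most important choice after the model and the data. Whether to use MSE, cross entropy, Focal, or...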

Untitled

2026-02-04 AINotes/999.深度学习调参指南

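<ul> <li>The batch size should be neither too large nor too small: too small wastes compute, too large wastes memory. It is usually set to a multiple of 16. For recommendation tasks, test 32-64-128-512; going higher generally brings no further gain, and going lower makes training too slow.</li> <li>Learning rate and...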

Untitled

2026-02-04 AINotes/999.深度学习调参指南

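<ul> <li>When the dataset is very large, first use 1/10 or 1/100 of the data to estimate training or inference time, so you know what to expect.</li> <li>Always use data augmentation for vision problems.</li> <li>Always preprocess the data, spreading its distribution to a mean of 0...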

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<h2 id="批量大小对梯度下降法的影响">Effect of Batch Size on Gradient Descent<a class="anchor-link" href="#批量大小对梯度下降法的影响" title="Permanent link">¶</a></h2> <p&...

AI | Essential Deep Learning for Algorithm Engineers: Optimization (Part 1)

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<p><img alt="cover_image" src="https://mmbiz.qlogo.cn/mmbiz_jpg/x3YYuUrJv099lPmzMAIv0VO5yKmukdjxibNYMM0Oiaa7rVlbf0KsQ3WtWn7LkMia5NfSZSnnNdG6H...

Automated Hyperparameter Tuning Methods for Machine Learning

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<p><img alt="cover_image" src="https://mmbiz.qlogo.cn/mmbiz_jpg/vI9nYe94fsEdWl1RjERNWqia63EmoBmWJFgw9TUA0ibJm5hvHWMcHXm4YmAkBibr3yZX8b4RZic2V...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<p><a href="https://zhuanlan.zhihu.com/p/343564175">Paper reading notes: a review and summary of Optimizer gradient descent optimization algorithms</a></p> <p>Whether you use PyTorch or TensorFlow,...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<blockquote> <p><a href="https://blog.csdn.net/zhaohongfei_358/article/details/129625803">The weight_decay Parameter: From Beginner to Expert</a></p&g...

PyTorch | 17 Ways to Optimize Neural Network Training

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<h1 id="pytorch--优化神经网络训练的17种方法">PyTorch | 17 Ways to Optimize Neural Network Training<a class="anchor-link" href="#pytorch--优化神经网络训练的17种方法" title="Permanent link">&pa...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<h2 id="todo">TODO<a class="anchor-link" href="#todo" title="Permanent link">¶</a></h2> <h2 id="gradient-descent--mome...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

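<p>Critical points are not necessarily the biggest obstacle encountered when training a network. In Figure 3.18, the horizontal axis is the number of parameter updates and the vertical axis is the loss. Typically, when training a network, the loss starts out large, shrinks as the parameters keep being updated, and eventually gets stuck and stops decreasing. Reaching a critical point means the gradient is very small, but when the loss stops decreasing...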

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<p><strong>Critical point</strong>:<br /> <strong>Local minimum</strong>:<br /> <strong>Saddle point</strong>:</p> <...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<h2 id="梯度下降">Gradient Descent<a class="anchor-link" href="#梯度下降" title="Permanent link">¶</a></h2> <p>Solve the following optimization problem:<br /> ...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<h2 id="一个框架回顾优化算法">Reviewing Optimization Algorithms in One Framework<a class="anchor-link" href="#一个框架回顾优化算法" title="Permanent link">¶</a></h2> <p>Deep learning optimization...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<h2 id="背景">Background<a class="anchor-link" href="#背景" title="Permanent link">¶</a></h2> <h3 id="梯度下降">Gradient Descent<a class="an...

Untitled

2026-02-04 AINotes/01.MLTutorials/08.模型优化

<h2 id="dropout">Dropout<a class="anchor-link" href="#dropout" title="Permanent link">¶</a></h2> <h3 id="how-to-train"...