宇宙湾
https://yuzhouwan.com/yuzhouwan_logo_32x32.ico
Accumulate deeply, deliver steadily
2025-02-08T02:35:39.368Z
https://yuzhouwan.com/
Benedict Jin
Hexo
Artificial Intelligence
https://yuzhouwan.com/posts/42737/
2017-05-16T10:58:02.000Z
2025-02-08T02:35:39.368Z
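<h2 id="什么是人工智能"><a href="#什么是人工智能" class="headerlink" title="什么是人工智能"></a>What Is Artificial Intelligence?</h2><p> <strong>Artificial Intelligence</strong> (<strong>AI</strong>), also called <strong>machine intelligence</strong>, is the intelligence exhibited by man-made systems. — <a href="https://zh.wikipedia.org/wiki/%E4%BA%BA%E5%B7%A5%E6%99%BA%E8%83%BD">wikipedia.org</a></p>
<p> From Deep Blue to <a href="https://arxiv.org/pdf/1712.01815">AlphaZero</a>, and on to <a href="https://arxiv.org/pdf/1812.04948">StyleGAN</a> and <a href="https://arxiv.org/pdf/2212.00857">GPT</a>, the capability, learning ability, and generality of AI have been advancing at an explosive pace;<br> from board games to medicine, and on to painting and chat, AI is proving itself in one application domain after another;<br> from CPUs to GPUs, and on to TPUs and IPUs, the computing power available to AI keeps pushing toward limits that were once out of reach …</p>
<p> Still, there is no need to rush. Steadily lighting up every branch and leaf of the AI knowledge tree is what anyone with a scientific mindset ought to do.</p>
<h2 id="关于本文"><a href="#关于本文" class="headerlink" title="关于本文"></a>About This Article</h2><p> This article explains AI in three parts.</p>
<p> First, it introduces the <strong>mainstream ideas</strong> and <strong>practical techniques</strong> of AI; a few well-known <a href="https://yuzhouwan.com/posts/4534/">interesting theorems</a> give an intuitive first impression of the field. Next, it turns to the <strong>theoretical pillars</strong> of AI and works step by step through <a href="https://yuzhouwan.com/posts/42737/#统计学">statistics</a>, <a href="https://yuzhouwan.com/posts/42737/#微积分">calculus</a>, <a href="https://yuzhouwan.com/posts/42737/#线性代数">linear algebra</a>, and <a href="https://yuzhouwan.com/posts/42737/#概率论">probability theory</a>. Finally, it comes down to practice: keeping up with the <strong>technical frontier</strong> of AI by understanding, studying, and applying major breakthrough projects. Layering the field horizontally in this way makes it easy to find a starting point for study.</p>
<p> For layout reasons, some of the code lives in other posts (<a href="https://yuzhouwan.com/posts/43687/#Python-第三方库">Python</a>, Prolog, R, <a href="https://yuzhouwan.com/posts/27328/">Java</a>); apologies for any inconvenience. This article is updated continuously, and corrections are always welcome (^o^)/</p>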
<h2 id="主流思想"><a href="#主流思想" class="headerlink" title="主流思想"></a>Mainstream Ideas</h2><h3 id="演绎法-amp-溯因法-amp-归纳法"><a href="#演绎法-amp-溯因法-amp-归纳法" class="headerlink" title="演绎法 & 溯因法 & 归纳法"></a>Deduction & Abduction & Induction</h3><p><img data-src="/picture/ai/ai_cause_rule_effect.png" alt=""></p>
<center>(Drawn with Axure™)</center>
<h2 id="实用技巧"><a href="#实用技巧" class="headerlink" title="实用技巧"></a>Practical Techniques</h2><h3 id="Occam-剃刀原理"><a href="#Occam-剃刀原理" class="headerlink" title="Occam 剃刀原理"></a>Occam's Razor</h3><p> <strong>Occam's razor</strong>, the <strong>principle of parsimony</strong>, is a problem-solving rule proposed by the 14th-century <a href="https://zh.wikipedia.org/wiki/%E9%80%BB%E8%BE%91%E5%AD%A6">logician</a> and <a href="https://zh.wikipedia.org/wiki/%E8%81%96%E6%96%B9%E6%BF%9F%E5%90%84%E6%9C%83">Franciscan</a> <a href="https://zh.wikipedia.org/wiki/%E4%BF%AE%E5%A3%AB">friar</a> <a href="https://zh.wikipedia.org/wiki/%E5%A5%A5%E5%8D%A1%E5%A7%86%E7%9A%84%E5%A8%81%E5%BB%89">William of Ockham</a>: <code>"Do not spend more resources on what can be done just as well with fewer"</code>. The same idea appears in Zheng Banqiao's line <strong>"prune the excess, like a tree in late autumn"</strong> (删繁就简三秋树)</p>
Apache Druid: An Efficient OLAP Engine
https://yuzhouwan.com/posts/5845/
2017-04-02T03:05:11.000Z
2025-02-08T02:26:51.760Z
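<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><h3 id="概述"><a href="#概述" class="headerlink" title="概述"></a>Overview</h3><p> <strong>Apache Druid</strong>™ is a popular, high-performance, distributed, column-oriented <strong>OLAP</strong> engine (strictly speaking, <strong>MOLAP</strong>). It provides fast (real-time) access to large volumes of rarely changing data, and it is designed to keep serving queries through code deployments, machine failures, and other production incidents</p>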
<p><img data-src="/picture/druid/druid_pumpkin_compressed.png" alt="Apache Druid Pumpkin"></p>
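<center>(Image: a personal Halloween piece by Vadim Ogievetsky, used with permission)</center>
<h3 id="特性"><a href="#特性" class="headerlink" title="特性"></a>Features</h3><h4 id="分析事件流"><a href="#分析事件流" class="headerlink" title="分析事件流"></a>Analyzing Event Streams</h4><p> Druid supports fast, highly concurrent queries over event-driven data. It can also ingest streaming data in real time and answer queries with sub-second latency, which is what makes rich interactive UIs possible</p>
<h4 id="创新的架构设计"><a href="#创新的架构设计" class="headerlink" title="创新的架构设计"></a>An Innovative Architecture</h4><p> Druid is a new kind of database that combines ideas from OLAP analytical databases, time-series databases, and full-text search systems to cover most use cases in a streaming architecture</p>
<h4 id="构建事件驱动的数据栈"><a href="#构建事件驱动的数据栈" class="headerlink" title="构建事件驱动的数据栈"></a>Building an Event-Driven Data Stack</h4><p> Druid integrates natively with message queues (such as Kafka and AWS Kinesis) and data lakes (such as HDFS and AWS S3), which makes it a good fit as the query layer on top of a streaming bus and stream processors</p>
<h4 id="解锁新的工作流"><a href="#解锁新的工作流" class="headerlink" title="解锁新的工作流"></a>Unlocking New Workflows</h4><p> Druid is built for fast ad hoc analysis of both real-time and historical data. Rapidly iterated queries support trend explanation and data exploration in response to all kinds of analytical needs</p>
<h4 id="多环境部署"><a href="#多环境部署" class="headerlink" title="多环境部署"></a>Deployment in Any Environment</h4><p> Druid can be deployed on any <code>*NIX</code> commodity hardware, in the cloud or on premises. Druid is cloud native, which means scaling a cluster out or in is as simple as adding or removing processes</p>
<h4 id="多数据源摄入"><a href="#多数据源摄入" class="headerlink" title="多数据源摄入"></a>Ingestion from Multiple Data Sources</h4><p> Druid can ingest data from a variety of external systems, including <a href="https://yuzhouwan.com/tags/Apache-Hadoop/">Hadoop</a>, <a href="https://yuzhouwan.com/posts/4735/">Spark</a>, <a href="https://yuzhouwan.com/tags/Apache-Storm/">Storm</a>, and <a href="https://yuzhouwan.com/posts/26002/">Kafka</a></p>
<h4 id="多版本并发控制"><a href="#多版本并发控制" class="headerlink" title="多版本并发控制"></a>Multi-Version Concurrency Control</h4><p> Multi-version concurrency control (<strong>MVCC</strong>, <strong>M</strong>ulti-<strong>V</strong>ersion <strong>C</strong>oncurrency <strong>C</strong>ontrol) addresses the concurrency problems that arise when multiple users operate on the same record. Instead of taking a heavy-handed row lock on concurrent access, MVCC creates a new version of the data whenever a transactional operation updates it. Reads and writes are thereby separated and no longer block each other, which improves concurrency. In addition, only the latest version of a record is valid at any moment, which preserves consistency</p>
<p> Druid uses the data update time to distinguish versions, and Historical nodes load only the latest version. Meanwhile, its Lambda architecture runs <strong>real-time indexing</strong> and <strong>offline batch overwrites</strong> side by side, which satisfies the need for real-time responses while keeping the data accurate</p>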
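<p> The version rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <code>Segment</code> class, its fields, and <code>visible_segments</code> are hypothetical names rather than Druid's API, and the sketch only handles segments whose intervals match exactly, whereas a real cluster also has to deal with partially overlapping intervals.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical, simplified view of a data segment."""
    interval: str  # e.g. "2017-04-01/2017-04-02"
    version: str   # ISO-8601 timestamp of when the segment was built
    source: str    # "realtime" or "batch" (illustrative label only)


def visible_segments(segments):
    """Keep only the newest version of each interval."""
    latest = {}
    for seg in segments:
        current = latest.get(seg.interval)
        # ISO-8601 timestamps in one fixed format compare correctly as strings.
        if current is None or seg.version > current.version:
            latest[seg.interval] = seg
    return sorted(latest.values(), key=lambda s: s.interval)


segments = [
    Segment("2017-04-01/2017-04-02", "2017-04-01T00:00:00.000Z", "realtime"),
    # A later batch job rebuilds the same interval and overshadows it.
    Segment("2017-04-01/2017-04-02", "2017-04-02T03:00:00.000Z", "batch"),
    Segment("2017-04-02/2017-04-03", "2017-04-02T00:00:00.000Z", "realtime"),
]

for seg in visible_segments(segments):
    print(seg.interval, seg.source)
```

<p> The batch-built segment replaces the real-time one for the first day, while the second day is still served from real-time data until a batch overwrite arrives.</p>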
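<h4 id="易于运维"><a href="#易于运维" class="headerlink" title="易于运维"></a>Easy to Operate</h4><p> A Druid cluster is self-healing and self-balancing. If a Druid server fails, the system automatically routes around it until the machine recovers or is replaced. To scale the cluster out or in, you only add or decommission servers, and the cluster rebalances itself in the background. Druid is designed to run around the clock without planned downtime, including during configuration changes and cluster upgrades</p>
Python: From Beginner to Practice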
https://yuzhouwan.com/posts/43687/
2015-02-05T16:08:40.000Z
2025-02-08T02:26:46.849Z
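This article covers Python's concepts and features, environment setup, basic syntax, the standard library, third-party libraries, scientific analysis tools, Python engineering tools, practical tips, and some pitfalls encountered along the way.
ZooKeeper: Principles and Optimization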
https://yuzhouwan.com/posts/31915/
2017-04-22T00:38:04.000Z
2025-02-08T02:26:44.379Z
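<h2 id="ZooKeeper-是什么?"><a href="#ZooKeeper-是什么?" class="headerlink" title="ZooKeeper 是什么?"></a>What Is ZooKeeper?</h2><p> <strong>ZooKeeper</strong> is an open-source system for distributed data consistency, modeled on the <a href="https://static.googleusercontent.com/media/research.google.com/zh-CN//archive/chubby-osdi06.pdf">Google Chubby</a> paper. It makes it easy for applications that depend on it to implement <code>data publish / subscribe</code>, <code>load balancing</code>, <code>service registration and discovery</code>, <code>distributed coordination</code>, <code>event notification</code>, <code>cluster management</code>, <code>leader election</code>, and <code>distributed locks and queues</code></p>
<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><h3 id="集群角色"><a href="#集群角色" class="headerlink" title="集群角色"></a>Cluster Roles</h3><p> In a distributed system, each machine in a cluster generally has its own role. The most typical arrangement is the <code>Master / Slave</code> mode. In this mode, the machine that handles all <code>write operations</code> is called the <code>Master</code> node, and the machines that obtain the latest data through <code>asynchronous replication</code> and serve <code>reads</code> are called <code>Slave</code> nodes</p>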
<p><img data-src="/picture/zk/zk_master_slave.png" alt=""></p>
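<center>(Drawn with Axure™)</center>
<p> ZooKeeper instead introduces three roles, <code>Leader</code>, <code>Follower</code>, and <code>Observer</code>, together with the corresponding states <code>Leading</code>, <code>Following</code>, <code>Observing</code>, and <code>Looking</code>. A ZooKeeper cluster selects one node as the <code>Leader</code> through a <code>leader election</code>; the Leader serves both <code>reads</code> and <code>writes</code> for clients. <code>Follower</code> and <code>Observer</code> nodes both serve <code>reads</code>. The only difference is that an <code>Observer</code> <code>takes no part in leader election</code> or in the <code>"majority write succeeds"</code> policy for <code>write operations</code>; an <code>Observer</code> is only informed of proposals that have already been committed. An <code>Observer</code> can therefore improve the cluster's <code>read performance</code> <code>without hurting write performance</code> (see the "Performance Optimization - Optimization Strategies - Observer Mode" section later in this article)</p>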
<p><img data-src="/picture/zk/zk_leader_follower_observer.png" alt=""></p>
<center>(Drawn with Axure™)</center>
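<p> To make the "majority write succeeds" rule concrete, here is a minimal Python sketch of the arithmetic. The <code>Ensemble</code> class and its fields are hypothetical and not part of any ZooKeeper client or server API; the sketch only assumes that the voting members are the Leader plus the Followers, and that Observers are never counted.</p>

```python
from dataclasses import dataclass


@dataclass
class Ensemble:
    """Hypothetical model of a cluster: one Leader plus the nodes below."""
    followers: int
    observers: int = 0

    @property
    def voters(self) -> int:
        # Only the Leader and the Followers vote; Observers do not.
        return self.followers + 1

    @property
    def quorum(self) -> int:
        # Smallest strict majority of the voting members.
        return self.voters // 2 + 1

    def is_committed(self, acks: int) -> bool:
        """acks = acknowledgements from voting members, Leader included."""
        return acks >= self.quorum


small = Ensemble(followers=4)
scaled = Ensemble(followers=4, observers=10)

# Adding Observers adds read capacity but leaves the write quorum unchanged.
print(small.quorum, scaled.quorum)
```

<p> Because the quorum size depends only on the voters, ten extra Observers do not increase the number of acknowledgements a write has to wait for.</p>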
A Programmer's Guide to Mac Productivity
https://yuzhouwan.com/posts/190101/
2019-01-01T11:58:02.000Z
2025-02-08T02:26:10.593Z
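How do you get the most out of a Mac as a productivity tool (and save yourself the equivalent of a short vacation)? This article covers basic environment setup, Java, Maven, advanced commands, tools, keyboard shortcuts, and a curated set of resources for improving your efficiency.
Kubernetes in Practice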
https://yuzhouwan.com/posts/200919/
2020-09-19T00:25:16.000Z
2025-02-08T02:26:10.592Z
Here's something encrypted, password is required to continue reading.
Advanced Git Techniques
https://yuzhouwan.com/posts/30041/
2017-04-11T15:36:16.000Z
2025-02-08T02:26:10.591Z
Git usage tips, common pitfalls, and abbreviations frequently seen in the GitHub community.
A Collection of Architecture Methodologies
https://yuzhouwan.com/posts/210313/
2021-03-13T14:02:55.000Z
2024-06-01T06:01:04.266Z
Here's something encrypted, password is required to continue reading.
Algorithm
https://yuzhouwan.com/posts/666/
2019-05-01T13:38:08.000Z
2024-05-12T06:42:34.873Z
<h2 id="LeetCode-组队刷题活动"><a href="#LeetCode-组队刷题活动" class="headerlink" title="LeetCode 组队刷题活动"></a>LeetCode Group Practice</h2><div class="note primary">Solve LeetCode problems as a team</div>
<h3 id="介绍"><a href="#介绍" class="headerlink" title="介绍"></a>Introduction</h3><h4 id="代码仓库"><a href="#代码仓库" class="headerlink" title="代码仓库"></a>Code Repository</h4><p> The repository is at <strong><a href="https://github.com/asdf2014/algorithm">asdf2014 / algorithm</a></strong></p>
<h4 id="报名途径"><a href="#报名途径" class="headerlink" title="报名途径"></a>How to Sign Up</h4><p> Leave a comment at the end of the "<a href="https://yuzhouwan.com/posts/666/">Algorithm</a>" post, or on <a href="https://github.com/asdf2014/gitment/issues/40">issues#40</a>, to join at any time</p>
<div class="note success">The comment can say anything. You are also welcome to mention how often you can solve problems, which problem-selection strategy you prefer, or any suggestions for how we capture algorithm knowledge; all input is much appreciated</div>
<h4 id="参与方式"><a href="#参与方式" class="headerlink" title="参与方式"></a>How to Participate</h4><p> Every participant receives <a href="https://help.github.com/en/github/setting-up-and-managing-your-github-user-account/permission-levels-for-a-user-account-repository">Collaborator</a> permission on the repository and can commit code freely (in any programming language). Everyone gets a personal directory under <code>/Codes/${your GitHub username}</code>. After you leave your GitHub username, an invitation will arrive shortly; you can accept it at <a href="https://github.com/asdf2014/algorithm/invitations">asdf2014 - algorithm - invitations</a> (contributions via pull request are welcome too). Then, from any directory (it does not need to be empty), run the following command to make your first commit in one step:</p>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">bash -c <span class="string">"<span class="subst">$(curl -L https://raw.githubusercontent.com/asdf2014/algorithm/master/first_commit.sh)</span>"</span></span><br></pre></td></tr></tbody></table></figure>
<h4 id="刷题频率"><a href="#刷题频率" class="headerlink" title="刷题频率"></a>Pace</h4><p> Since most people have limited spare time, the pace is provisionally set at one problem per week</p>
<h4 id="选题策略"><a href="#选题策略" class="headerlink" title="选题策略"></a>Problem Selection</h4><p> The <a href="https://github.com/asdf2014/algorithm/blob/master/Picker/random_picker.py">problem-picking bot</a> automatically selects a random problem at 8 p.m. every Friday; click <a href="https://github.com/asdf2014/algorithm#%E9%80%89%E9%A2%98%E7%AD%96%E7%95%A5">here</a> to see the current problem.</p>
<h3 id="其他"><a href="#其他" class="headerlink" title="其他"></a>Miscellaneous</h3><p> If you run into trouble with Git, see my post "<a href="https://yuzhouwan.com/posts/30041/">Advanced Git Techniques</a>"</p>
<div class="note success">You can also leave a comment at the end of the article. Both Gitalk and Disqus are supported, to better serve readers inside and outside China</div>
<p> To make discussion easier, you are also welcome to join the algorithm QQ group <a href="https://shang.qq.com/wpa/qunwpa?idkey=bfbcf1453371a0810fd6be235ace47147f6fb9d262fb768b497c861f50af0af4"><img data-src="/picture/algorithm/algorithm_qq_group_5366753.svg" alt=""></a> or the Gitter chat room <a href="https://gitter.im/yuzhouwan/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge"><img data-src="/picture/algorithm/algorithm_gitter_community.svg" alt=""></a></p>
<div class="note danger">Please do not discuss the answer to the group's entry question in the comments, so that spammers stay out</div>
<p> Because most algorithms can be implemented in several ways, we try to show every viable approach. To keep the layout compact, different implementations of the same algorithm are shown as tabs, and the tab shown by default is the best solution. If you want to read quickly, you can skip the other tabs. It looks like this:</p>
<div class="tabs" id="code"><ul class="nav-tabs"><li class="tab"><a href="#code-1">CODE 1</a></li><li class="tab"><a href="#code-2">CODE 2</a></li><li class="tab active"><a href="#code-3">CODE 3</a></li></ul><div class="tab-content"><div class="tab-pane" id="code-1"><p><strong>Iterative solution</strong></p>
<figure class="highlight python"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">solution</span>(<span class="params">n</span>):</span><br><span class="line">    <span class="keyword">if</span> n <= <span class="number">1</span>:</span><br><span class="line">        <span class="keyword">return</span> n</span><br><span class="line">    a = <span class="number">0</span></span><br><span class="line">    b = <span class="number">1</span></span><br><span class="line">    <span class="keyword">while</span> n > <span class="number">1</span>:</span><br><span class="line">        n = n - <span class="number">1</span></span><br><span class="line">        sum_ = a + b</span><br><span class="line">        a = b</span><br><span class="line">        b = sum_</span><br><span class="line">    <span class="keyword">return</span> b</span><br></pre></td></tr></tbody></table></figure></div><div class="tab-pane" id="code-2"><p><strong>Recursive solution</strong></p>
<figure class="highlight python"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">solution</span>(<span class="params">n</span>):</span><br><span class="line">    <span class="keyword">if</span> n <= <span class="number">1</span>:</span><br><span class="line">        <span class="keyword">return</span> n</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="keyword">return</span> solution(n - <span class="number">1</span>) + solution(n - <span class="number">2</span>)</span><br></pre></td></tr></tbody></table></figure></div><div class="tab-pane active" id="code-3"><p><strong>Dynamic programming solution</strong></p>
<figure class="highlight python"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">solution</span>(<span class="params">n</span>):</span><br><span class="line">    <span class="keyword">if</span> n <= <span class="number">1</span>:</span><br><span class="line">        <span class="keyword">return</span> n</span><br><span class="line">    cache = [<span class="number">0</span>] * (n + <span class="number">1</span>)</span><br><span class="line">    cache[<span class="number">1</span>] = <span class="number">1</span></span><br><span class="line">    <span class="keyword">for</span> i <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">2</span>, n + <span class="number">1</span>):</span><br><span class="line">        cache[i] = cache[i - <span class="number">1</span>] + cache[i - <span class="number">2</span>]</span><br><span class="line">    <span class="keyword">return</span> cache[n]</span><br></pre></td></tr></tbody></table></figure></div></div></div>
Elasticsearch, the Search Engine
https://yuzhouwan.com/posts/22654/
2017-04-02T04:18:42.000Z
2024-05-12T06:42:29.576Z
<h2 id="Elasticsearch-是什么?"><a href="#Elasticsearch-是什么?" class="headerlink" title="Elasticsearch 是什么?"></a>What Is Elasticsearch?</h2><p> <a href="https://yuzhouwan.com/posts/22654/"><strong>Elasticsearch</strong></a>™ is a search engine built on Lucene. It is stable, reliable, and fast, and it scales out well horizontally</p>
<h2 id="特性"><a href="#特性" class="headerlink" title="特性"></a>Features</h2><ul>
<li>Feature-rich and usable out of the box</li>
<li>Horizontally scalable</li>
<li>Sharding helps relieve hot spots</li>
<li>Multiple replicas provide high availability</li>
<li>Fine-grained circuit breakers</li>
<li>A large community and a mature ecosystem</li>
</ul>
<h2 id="主要概念"><a href="#主要概念" class="headerlink" title="主要概念"></a>Key Concepts</h2><h3 id="Cluster-集群"><a href="#Cluster-集群" class="headerlink" title="Cluster 集群"></a>Cluster</h3><p> In a distributed deployment, multiple Elasticsearch nodes form a <strong>cluster</strong>. The cluster dynamically elects a master node, so the master is not a single point of failure<br> Within the same subnet, processes configured with the same cluster name are automatically grouped into one cluster. Communication between nodes and data balancing are managed entirely by Elasticsearch</p>
<h3 id="Node-节点"><a href="#Node-节点" class="headerlink" title="Node 节点"></a>Node</h3><p> Each Elasticsearch process is called a <strong>node</strong>. In a test environment you can run several Elasticsearch processes on one server; in production, running only one Elasticsearch process per server is recommended</p>
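<h3 id="Index-索引"><a href="#Index-索引" class="headerlink" title="Index 索引"></a>Index</h3><p> An index is where Elasticsearch stores document data. With types deprecated, it plays roughly the role of a table in a traditional relational database (older material compares it to a database). More of the logical correspondences are shown in the table below:</p>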
<div class="table-container">
<table>
<thead>
<tr>
<th style="text-align:center">Relational DB</th>
<th style="text-align:center">HBase</th>
<th style="text-align:center">Elasticsearch</th>
<th style="text-align:center">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center">Database</td>
<td style="text-align:center">NameSpace</td>
<td style="text-align:center">Template</td>
<td style="text-align:center">A template configuration for a group of indices</td>
</tr>
<tr>
<td style="text-align:center">Table</td>
<td style="text-align:center">Table</td>
<td style="text-align:center">Index</td>
<td style="text-align:center">An index</td>
</tr>
<tr>
<td style="text-align:center">Row</td>
<td style="text-align:center">RowKey</td>
<td style="text-align:center">Document</td>
<td style="text-align:center">A document, the same concept as in Lucene</td>
</tr>
<tr>
<td style="text-align:center">Column + Value</td>
<td style="text-align:center">Cell</td>
<td style="text-align:center">Field</td>
<td style="text-align:center">If a document is viewed as JSON, a Field is a key together with its value</td>
</tr>
<tr>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">Term</td>
<td style="text-align:center">The basic unit of search, equivalent to one word in the text</td>
</tr>
<tr>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">Token</td>
<td style="text-align:center">The Term's content and type, plus the Term's start position and offset in the text</td>
</tr>
</tbody>
</table>
</div>
<div class="note info">Elasticsearch 7.x deprecates the concept of Type</div>
Leadership Methodology
https://yuzhouwan.com/posts/231209/
2023-12-09T12:36:12.000Z
2024-05-12T06:41:36.771Z
Here's something encrypted, password is required to continue reading.
The Essence of Logic
https://yuzhouwan.com/posts/240127/
2024-01-27T04:10:06.000Z
2024-05-12T06:41:36.770Z
Here's something encrypted, password is required to continue reading.
Interesting Mathematics
https://yuzhouwan.com/posts/4534/
2017-11-23T03:12:59.000Z
2024-05-12T06:41:36.768Z
Here's something encrypted, password is required to continue reading.
TLA+: A Language That Makes Discrete Mathematics Feel Worth Learning
https://yuzhouwan.com/posts/200725/
2020-07-24T23:56:02.000Z
2024-05-12T06:41:36.765Z
Here's something encrypted, password is required to continue reading.
Redis in Practice
https://yuzhouwan.com/posts/2129/
2017-07-19T13:16:08.000Z
2024-05-12T06:41:36.764Z
This article covers Redis environment setup, practical tips, internals, and the Jedis client.
QCon 2015 Notes, Part 1: Yuantiku (猿题库)
https://yuzhouwan.com/posts/17444/
2015-04-25T14:57:15.000Z
2024-05-12T06:41:36.762Z
Notes on one of the sessions at QCon 2015.
Apache Kafka: A Distributed Message Queue Framework
https://yuzhouwan.com/posts/26002/
2015-05-10T04:43:42.000Z
2024-05-12T06:41:36.758Z
<h2 id="Kafka-是什么?"><a href="#Kafka-是什么?" class="headerlink" title="Kafka 是什么?"></a>What Is Kafka?</h2><p> <strong>Kafka</strong> is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.</p>
<h2 id="为什么要有-Kafka"><a href="#为什么要有-Kafka" class="headerlink" title="为什么要有 Kafka?"></a>Why Kafka?</h2><h3 id="分布式"><a href="#分布式" class="headerlink" title="分布式"></a>Distributed</h3><p> Kafka has the properties expected of a distributed system: it is economical, fast, reliable, easy to scale, flexible, and convenient for communication, with shared data and shared devices</p>
<h3 id="高吞吐量"><a href="#高吞吐量" class="headerlink" title="高吞吐量"></a>High Throughput</h3><p> Delivers high throughput for both data producers and consumers</p>
<h3 id="高可靠性"><a href="#高可靠性" class="headerlink" title="高可靠性"></a>High Reliability</h3><p> Supports multiple consumers; when one consumer fails, its load is rebalanced automatically across the rest</p>
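<p> The rebalancing idea can be illustrated with a toy Python sketch. It is a simplification under stated assumptions: <code>assign_partitions</code> is a hypothetical helper that spreads partitions round-robin over the consumers that are still alive, and it does not reproduce Kafka's actual group coordination protocol or any of its assignor implementations.</p>

```python
def assign_partitions(partitions, consumers):
    """Spread partitions round-robin over the live consumers (toy model)."""
    if not consumers:
        raise ValueError("at least one live consumer is required")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = list(range(6))

# Three healthy consumers share six partitions.
before = assign_partitions(partitions, ["c1", "c2", "c3"])

# c2 fails: the same partitions are redistributed over the survivors.
after = assign_partitions(partitions, ["c1", "c3"])

print(before)
print(after)
```

<p> Every partition still has exactly one owner after the failure, so consumption continues without manual intervention.</p>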
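<h3 id="离线-amp-实时性"><a href="#离线-amp-实时性" class="headerlink" title="离线 & 实时性"></a>Offline & Real-Time</h3><p> Persists messages so that they can be processed in batches</p>
<h3 id="解耦"><a href="#解耦" class="headerlink" title="解耦"></a>Decoupling</h3><p> Acts as a bridge between systems, so that they do not couple to one another directly</p>
Helm in Practice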
https://yuzhouwan.com/posts/200926/
2020-09-26T07:08:00.000Z
2024-04-13T11:19:05.897Z
Here's something encrypted, password is required to continue reading.
Practical Linux Tips
https://yuzhouwan.com/posts/15691/
2017-04-07T12:30:58.000Z
2023-12-30T05:30:04.581Z
Covers Linux commands, shell programming, practical tips, hands-on optimization, and system architecture.
How to Use JVM Knowledge to Improve Your Programming
https://yuzhouwan.com/posts/27328/
2016-07-16T09:38:16.000Z
2023-12-30T05:28:49.381Z
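<h2 id="什么是-JVM"><a href="#什么是-JVM" class="headerlink" title="什么是 JVM?"></a>What Is the JVM?</h2><p> A <strong>J</strong>ava <strong>V</strong>irtual <strong>M</strong>achine (JVM) is an abstract computing machine that enables a computer to run a Java program.</p>
<h2 id="为什么要有-JVM"><a href="#为什么要有-JVM" class="headerlink" title="为什么要有 JVM?"></a>Why the JVM?</h2><h3 id="跨平台性"><a href="#跨平台性" class="headerlink" title="跨平台性"></a>Cross-Platform</h3><p> The JVM makes Java programs easy to port across platforms and largely frees them from hardware dependencies (which also satisfies <a href="https://en.wikipedia.org/wiki/David_Parnas">David Parnas</a>'s "<a href="https://en.wikipedia.org/wiki/Information_hiding">information hiding</a>" principle)</p>
<h3 id="多语言性"><a href="#多语言性" class="headerlink" title="多语言性"></a>Multi-Language</h3><p> Thanks to JIT compilation, efficient GC, JUC's support for multithreaded concurrent programming, and the vast number of mature libraries in the community, <a href="https://en.wikipedia.org/wiki/List_of_JVM_languages">many languages</a> have developed versions that run on the JVM</p>
<p> Polyglot programming is also becoming a trend. Where rapid development, flexible deployment, or a problem-specific DSL is called for, choosing the right JVM-hosted language maximizes the value of existing code</p>
<p> <strong>So how should JVM knowledge be applied in day-to-day development to steadily improve real programming skill? After much searching, the following layers turned out to be good starting points</strong></p>
Consensus Algorithms in the Big Data Ecosystem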
https://yuzhouwan.com/posts/54206/
2016-10-11T03:48:41.000Z
2023-12-30T05:28:49.380Z
<h2 id="大数据生态圈中,保证一致性的方式举不胜举"><a href="#大数据生态圈中,保证一致性的方式举不胜举" class="headerlink" title="大数据生态圈中,保证一致性的方式举不胜举"></a>There Are Countless Ways to Ensure Consistency in the Big Data Ecosystem</h2><ul>
<li><a href="https://yuzhouwan.com/posts/60504/">Hadoop</a> uses <a href="https://yuzhouwan.com/posts/31915/">ZooKeeper</a> (Zab, a Paxos-like protocol that preserves transaction order)</li>
<li><a href="https://yuzhouwan.com/posts/22654/">Elasticsearch</a> uses a <a href="https://yuzhouwan.com/posts/31130/">Hash</a> routing algorithm (not consistent hashing)</li>
<li><a href="https://yuzhouwan.com/posts/20644/#Elasticsearch-Connector">Cassandra</a> uses the Gossip protocol</li>
<li><a href="https://yuzhouwan.com/posts/2129/">Redis</a> uses a <a href="https://yuzhouwan.com/posts/31915/#Raft">Raft</a>-style election algorithm</li>
</ul>
<p>How do they differ, and why was each one chosen?</p>
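<h3 id="Paxos-选举算法"><a href="#Paxos-选举算法" class="headerlink" title="Paxos 选举算法"></a>Paxos Election Algorithm</h3><p> <strong>Paxos</strong> is one of the earliest algorithms to solve distributed consensus in the presence of crash failures (it does not handle the <strong>Byzantine generals problem</strong>, in which nodes may send arbitrary or malicious messages). It uses a <strong>majority quorum</strong> to keep the replicas in a cluster consistent (it is no longer a good fit for <a href="https://yuzhouwan.com/posts/31915/#其他技术比对">service registration and discovery</a> in microservices)</p>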
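<h3 id="Raft-选举算法"><a href="#Raft-选举算法" class="headerlink" title="Raft 选举算法"></a>Raft Election Algorithm</h3><p> Redis implements its own distributed coordination with a <strong>Raft</strong>-style leader election (for example, during Redis Sentinel failover). Raft targets the same scenarios as Paxos. The differences lie in a simplified protocol, a stronger notion of Term, and a log that replicates in one direction only, from Leader to Follower, all of which make it much easier to implement</p>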
<h3 id="Zab-原子广播协议"><a href="#Zab-原子广播协议" class="headerlink" title="Zab 原子广播协议"></a>Zab Atomic Broadcast Protocol</h3><p> Hadoop leans toward offline processing of massive data sets, so relying on <a href="https://yuzhouwan.com/posts/31915/">ZooKeeper</a> to keep <a href="https://yuzhouwan.com/posts/31915/#Paxos-的强一致性">data replicas consistent</a> is the most suitable choice</p>
<h3 id="Hash-路由算法"><a href="#Hash-路由算法" class="headerlink" title="Hash 路由算法"></a>Hash Routing Algorithm</h3><p> When an <a href="https://yuzhouwan.com/posts/22654/">Elasticsearch</a> cluster receives a request to index a document, it has to choose which shard (a complete, independent Lucene index instance) will index it. Early versions of Elasticsearch used the <a href="http://www.cse.yorku.ca/~oz/hash.html">djb2</a> hash algorithm (commonly known as <a href="https://azrael.digipen.edu/~mmead/www/Courses/CS280/HashFunctions-1.html">times33</a>) for this, and version 2.0 switched the default to Murmur3. The document's default or specified key is hashed, <code>hash(key)</code>, and the result is taken modulo the number of primary shards n, that is, $hash(key) \, mod \, n$</p>
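<p> The routing formula is easy to try out in Python. The sketch below is an assumption-laden illustration rather than Elasticsearch's implementation: it uses the classic djb2 recurrence (start at 5381, multiply by 33, add each byte), truncates to 32 bits, and the helper names <code>djb2</code> and <code>route</code> are made up for this example.</p>

```python
def djb2(key: str) -> int:
    """Classic djb2 ("times 33") hash, truncated to 32 bits."""
    h = 5381
    for byte in key.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def route(key: str, num_primary_shards: int) -> int:
    """shard = hash(key) mod n"""
    return djb2(key) % num_primary_shards


doc_ids = ["a", "doc-1", "doc-2", "user:42"]

for doc_id in doc_ids:
    # The same key can land on a different shard once n changes.
    print(doc_id, route(doc_id, 5), route(doc_id, 6))
```

<p> Because the shard is computed modulo n, changing the number of primary shards remaps most keys. That is the practical difference from consistent hashing, and it is why the number of primary shards is normally fixed when an index is created.</p>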
<h3 id="一致性-Hash"><a href="#一致性-Hash" class="headerlink" title="一致性 Hash"></a>Consistent Hashing</h3><p> An algorithm for <strong>load balancing</strong> data storage. A notable development is Google's 2016 paper on <a href="https://arxiv.org/abs/1608.01350">consistent hashing with bounded loads</a>. The algorithm keeps the <strong>consistency</strong> and <strong>stability</strong> of load balancing while making a substantial improvement in <strong>uniformity</strong>. Consistent Hashing with Bounded Loads has also been <a href="https://github.com/haproxy/haproxy/blob/master/src/lb_chash.c#L244">applied</a> in the open-source <a href="http://www.haproxy.org/">HAProxy</a> project, where it reportedly cut cache bandwidth by a factor of about 8</p>
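<p> A rough Python sketch of the bounded-load idea follows. It is a simplified illustration, not the paper's or HAProxy's implementation: there are no virtual nodes, the total number of keys is assumed to be known up front so that each server's capacity can be fixed at <code>ceil(c * keys / servers)</code>, and the names <code>assign</code> and <code>_hash</code> as well as the default <code>c=1.25</code> are chosen for this example only.</p>

```python
import bisect
import hashlib
import math


def _hash(value: str) -> int:
    """Map a string to a point on the ring (MD5 used only for spreading)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def assign(keys, servers, c=1.25):
    """Place keys on a hash ring, capping every server at c times the average load."""
    ring = sorted((_hash(s), s) for s in servers)
    points = [point for point, _ in ring]
    capacity = math.ceil(c * len(keys) / len(servers))
    load = {s: 0 for s in servers}
    placement = {}
    for key in keys:
        # First server clockwise from the key's position on the ring.
        i = bisect.bisect_right(points, _hash(key)) % len(ring)
        # Walk clockwise past servers that are already full.
        while load[ring[i][1]] >= capacity:
            i = (i + 1) % len(ring)
        server = ring[i][1]
        load[server] += 1
        placement[key] = server
    return placement, load, capacity


keys = [f"key-{n}" for n in range(1000)]
servers = ["s1", "s2", "s3", "s4"]
placement, load, capacity = assign(keys, servers)
print(capacity, load)
```

<p> Since the combined capacity is at least the number of keys whenever c is 1 or more, the clockwise walk always finds a server with room, and no server ends up with more than its cap.</p>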
<h3 id="Gossip-闲话算法"><a href="#Gossip-闲话算法" class="headerlink" title="Gossip 闲话算法"></a>Gossip Protocol</h3><p> Cassandra mainly relies on <strong>Gossip</strong> for its distributed consistency. Because Cassandra puts more weight on <strong>decentralization</strong> and <strong>fault tolerance</strong>, it accepts eventual consistency, within the limits of the CAP theorem</p>
Serverless Explained
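<p> How a rumor spreads without any central coordinator can be simulated in a few lines. The sketch below is a generic push-style gossip simulation, not Cassandra's actual protocol: in each round every informed node tells one peer chosen uniformly at random, and the function name <code>gossip_rounds</code>, the seed, and the round limit are assumptions made for this example.</p>

```python
import random


def gossip_rounds(n_nodes, seed=0, max_rounds=1000):
    """Simulate push gossip; return (rounds used, number of informed nodes)."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the new piece of state
    rounds = 0
    while len(informed) < n_nodes and rounds < max_rounds:
        # Every node that already knows the state tells one random peer.
        for _node in list(informed):
            peer = rng.randrange(n_nodes)
            informed.add(peer)
        rounds += 1
    return rounds, len(informed)


rounds, informed_count = gossip_rounds(50)
print(rounds, informed_count)
```

<p> The number of informed nodes can at most double per round, so the spread takes on the order of log(n) rounds. No node is special, which is what gives the approach its decentralization and fault tolerance, at the price of nodes being briefly out of date (eventual consistency).</p>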
https://yuzhouwan.com/posts/201001/
2020-10-01T10:52:02.000Z
2023-12-30T05:28:49.377Z
<h2 id="Serverless-是什么?"><a href="#Serverless-是什么?" class="headerlink" title="Serverless 是什么?"></a>What Is Serverless?</h2><blockquote>
<p>Serverless computing is a cloud computing execution model in which the cloud provider runs the server, and dynamically manages the allocation of machine resources. Pricing is based on the actual amount of resources consumed by an application, rather than on pre-purchased units of capacity. It can be a form of utility computing. — <a href="https://en.wikipedia.org/wiki/Serverless_computing">wikipedia.org</a></p>
<p>Serverless architectures are application designs that incorporate third-party “Backend as a Service” (BaaS) services, and/or that include custom code run in managed, ephemeral containers on a “Functions as a Service” (FaaS) platform. — <a href="https://martinfowler.com/articles/serverless.html">《Serverless Architectures》</a></p>
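<p>A serverless architecture is an Internet-based system in which application development does not use conventional server processes. Instead, it relies solely on a combination of third-party services (such as AWS Lambda), client-side logic, and service-hosted remote procedure calls. — <a href="https://aws.amazon.com/cn/blogs/china/iaas-faas-serverless/">AWS official blog</a></p>
<p>Serverless means that server-side logic is implemented by developers, runs in stateless compute containers, is triggered by events, and is fully managed by a third party, while business-level state is stored in a database or other storage. — <a href="https://www.bookstack.cn/read/serverless-handbook/concepts-what-is-serverless.md">Serverless Architecture Practice Handbook (《无服务架构实践手册》)</a></p>
<p>If your PaaS can efficiently start instances in 20ms that run for half a second, then call it serverless. — Adrian Cockcroft</p>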
</blockquote>
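<h2 id="优缺点"><a href="#优缺点" class="headerlink" title="优缺点"></a>Pros and Cons</h2><h3 id="优势"><a href="#优势" class="headerlink" title="优势"></a>Advantages</h3><h4 id="低成本"><a href="#低成本" class="headerlink" title="低成本"></a>Low Cost</h4><h5 id="运维成本"><a href="#运维成本" class="headerlink" title="运维成本"></a>Operations Cost</h5><p> Servers, middleware, and databases are all hosted on the BaaS/FaaS platform, so users no longer maintain infrastructure or software, which removes the cost of operating clusters</p>
<h5 id="开发成本"><a href="#开发成本" class="headerlink" title="开发成本"></a>Development Cost</h5><p> Whereas IaaS and PaaS platforms expose servers or operating systems, in a serverless architecture users work with service-level components such as storage and authorization services, which shortens development cycles and saves time</p>
<h4 id="按需计费"><a href="#按需计费" class="headerlink" title="按需计费"></a>Pay-per-Use Billing</h4><p> Unlike IaaS/PaaS, which bill for pre-allocated compute resources, Serverless/FaaS typically bills by number of requests and execution time. This maximizes resource utilization and achieves genuine pay-per-use billing, lowering costs for users</p>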
<p><img data-src="/picture/serverless/serverless_cost.png" alt="Serverless cost"></p>
<center>(Hand-drawn on an <a href="https://www.apple.com/cn/ipad/" target="_blank">iPad</a>™)</center>
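<p> The difference between the two billing models comes down to simple arithmetic, sketched below in Python. All prices and workload numbers are hypothetical placeholders chosen for illustration, not any provider's actual rates, and the function names <code>faas_cost</code> and <code>reserved_cost</code> are made up for this example.</p>

```python
def faas_cost(requests, avg_duration_s, memory_gb,
              price_per_million_requests, price_per_gb_second):
    """Pay-per-use: a fee per request plus a fee for memory held while running."""
    request_fee = requests / 1_000_000 * price_per_million_requests
    compute_fee = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_fee + compute_fee


def reserved_cost(instances, hours, price_per_instance_hour):
    """Pre-allocated: pay for every instance-hour, busy or idle."""
    return instances * hours * price_per_instance_hour


# Hypothetical workload: one million 0.5-second requests per month at 1 GB.
monthly_faas = faas_cost(
    requests=1_000_000,
    avg_duration_s=0.5,
    memory_gb=1.0,
    price_per_million_requests=0.2,   # hypothetical price
    price_per_gb_second=0.00001,      # hypothetical price
)

# Hypothetical alternative: two always-on instances for a 720-hour month.
monthly_reserved = reserved_cost(
    instances=2, hours=720, price_per_instance_hour=0.05  # hypothetical price
)

print(round(monthly_faas, 2), round(monthly_reserved, 2))
```

<p> With a light, bursty workload the pay-per-use bill tracks actual usage, while the pre-allocated bill stays the same even when the instances sit idle. For a steadily busy service the comparison can go the other way, so the numbers are worth recomputing for each workload.</p>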
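<h4 id="高扩展"><a href="#高扩展" class="headerlink" title="高扩展"></a>High Scalability</h4><p> Scales out horizontally and automatically (deployment in milliseconds, lifecycles measured in seconds)</p>
<h4 id="高资源利用率"><a href="#高资源利用率" class="headerlink" title="高资源利用率"></a>High Resource Utilization</h4><p> Provides fine-grained compute capacity that tracks real-time demand closely, which greatly improves resource utilization</p>
<h4 id="NoOps"><a href="#NoOps" class="headerlink" title="NoOps"></a>NoOps</h4><p> Operations has evolved through manual operations, automated operations, DevOps, and AIOps. In the serverless model, users only need to care about business code, which brings the cost of operations close to zero</p>
<div class="note info">In a broader sense, Ops covers not only server maintenance but also deployment, networking, security, monitoring, failure recovery, and horizontal scaling</div>
Thanks to Director Zhang for the Gift Book, and Best Wishes for the New Release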
https://yuzhouwan.com/posts/231217/
2023-12-17T05:06:18.000Z
2023-12-17T05:13:48.351Z
<p><img data-src="/picture/spark/pyspark_book_1.jpg" alt=""></p>
<p><img data-src="/picture/spark/pyspark_book_2.jpg" alt=""></p>
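<p>《PySpark 大数据分析实战》 (PySpark Big Data Analysis in Practice) combines thorough theory with a wealth of hands-on cases, and it is a genuinely valuable book. Drawing on years of practical experience, the author starts from basic concepts and progresses step by step to advanced applications, presenting the PySpark body of knowledge in full. Whether you are new to big data or an experienced expert, this book offers practical reference and professional guidance.</p>
Java Knowledge You Can't Get Around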
https://yuzhouwan.com/posts/190413/
2019-04-13T07:54:52.000Z
2023-12-10T05:06:17.597Z
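<h2 id="关于本文"><a href="#关于本文" class="headerlink" title="关于本文"></a>About This Article</h2><p> Although I have been working with Java for more than ten years, the notes from my early days of learning were unfortunately not preserved. This article collects scattered points from recent years of work and study, including basic concepts, practical programming techniques, code readability, design patterns, performance optimization (tools & coding), testing, the JVM, commonly used tools, and common problems. In the spirit of "the palest ink beats the best memory", it has been written down gradually while falling into and climbing out of pitfalls. I hope it serves as a starting point that draws out readers' own insights.</p>
<h2 id="基础知识"><a href="#基础知识" class="headerlink" title="基础知识"></a>Fundamentals</h2><h3 id="注解"><a href="#注解" class="headerlink" title="注解"></a>Annotations</h3><h4 id="GuardedBy"><a href="#GuardedBy" class="headerlink" title="GuardedBy"></a>GuardedBy</h4><p> The <code>@GuardedBy</code> annotation can be applied to a field or a method. It declares that every access to the annotated resource must be protected by a synchronized block on the named lock. A simple example:</p>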
<figure class="highlight java"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@GuardedBy("obj")</span></span><br><span class="line"><span class="keyword">private</span> ConcurrentMap<String, String> map = <span class="keyword">new</span> <span class="title class_">ConcurrentHashMap</span><>();</span><br><span class="line"><span class="keyword">private</span> <span class="keyword">final</span> <span class="type">Object</span> <span class="variable">obj</span> <span class="operator">=</span> <span class="keyword">new</span> <span class="title class_">Object</span>();</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title function_">put</span><span class="params">(String k, String v)</span> {</span><br><span class="line"> <span class="keyword">synchronized</span> (obj) {</span><br><span class="line"> map.put(k, v);</span><br><span class="line"> }</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span 
class="comment">/**</span></span><br><span class="line"><span class="comment"> * If you use `error prone` tool to check this, this annotation should be `<span class="doctag">@SuppressWarnings</span>("GuardedBy")`</span></span><br><span class="line"><span class="comment"> * {<span class="doctag">@see</span> https://errorprone.info/bugpattern/GuardedBy}</span></span><br><span class="line"><span class="comment"> * {<span class="doctag">@see</span> https://github.com/apache/druid/pull/6868#discussion_r249639199}</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="meta">@SuppressWarnings("FieldAccessNotGuarded")</span></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title function_">remove</span><span class="params">(String k)</span> {</span><br><span class="line"> map.remove(k);</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line"><span class="meta">@Override</span></span><br><span class="line"><span class="keyword">public</span> String <span class="title function_">toString</span><span class="params">()</span> {</span><br><span class="line"> <span class="keyword">synchronized</span> (obj) {</span><br><span class="line"> <span class="keyword">return</span> <span class="string">"GuardedByExample{"</span> +</span><br><span class="line"> <span class="string">"map="</span> + map +</span><br><span class="line"> <span class="string">'}'</span>;</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></tbody></table></figure>
<p>Tips: <a href="https://github.com/apache/druid/pull/6903">Code Example</a> from <a href="https://yuzhouwan.com/posts/5845/">Apache Druid</a>; in addition, the <strong>error-prone</strong> tool can check <a href="https://github.com/google/error-prone/blob/master/docs/bugpattern/GuardedBy.md#guardedby">several variants</a> of <code>@GuardedBy</code></p>
How to Become an Apache PMC Member
https://yuzhouwan.com/posts/19631/
2017-04-03T07:46:21.000Z
2023-09-30T10:24:08.686Z
<h2 id="关于本文"><a href="#关于本文" class="headerlink" title="关于本文"></a>About This Article</h2><p> This article mainly records my journey of contributing code to <a href="https://yuzhouwan.com/posts/5845/"><code>Apache Druid</code></a> / <a href="https://yuzhouwan.com/posts/39683/"><code>Apache Eagle</code></a> / <a href="https://yuzhouwan.com/posts/20644/"><code>Apache Flink</code></a> / <a href="https://yuzhouwan.com/posts/45888/"><code>Apache HBase</code></a> / <a href="https://yuzhouwan.com/posts/26002/"><code>Apache Kafka</code></a> / <a href="https://yuzhouwan.com/posts/743/"><code>Apache Superset</code></a> / <a href="https://yuzhouwan.com/posts/31915/"><code>Apache ZooKeeper</code></a> & <a href="https://yuzhouwan.com/posts/31915/"><code>Apache Curator</code></a> / <a href="https://yuzhouwan.com/posts/42737/"><code>TensorFlow</code></a> / <a href="https://github.com/alibaba/DataX"><code>Alibaba DataX</code></a> <a href="https://yuzhouwan.com/posts/19631/#其他">and other</a> open-source projects, doing what little I can.</p>
<p> The article closes with some lessons learned, in the hope of helping others who also <strong>love open source</strong> and want to become a <a href="http://people.apache.org/committer-index.html#asdf2014">PMC</a> member.</p>
The Evolution of Apache Druid's Cloud-Native Architecture
https://yuzhouwan.com/posts/220820/
2022-08-19T23:16:56.000Z
2023-09-30T10:22:27.316Z
<p>I was fortunate to take part in ApacheCon 2022 as a speaker. The event has wrapped up successfully, and I am deeply grateful to the organizers for the invitation and the planning!</p>
<p>My talk was titled <strong>"The Evolution of Apache Druid's Cloud-Native Architecture"</strong>.</p>
<ul>
<li><a href="https://apachecon.com/acasia2022/zh/sessions/bigdata-1033.html">Speaker profile</a></li>
<li><a href="https://www.bilibili.com/video/BV1Tg411C79y">Video replay</a></li>
<li><a href="https://mp.weixin.qq.com/s/iXWjV6m4oK1C04bja3ySXw">Transcript</a></li>
</ul>
<p>The materials above are yours to take 😄</p>
<p>The talk was well received by the organizers, the audience, and readers alike. I also got to know a number of experts, which was my biggest takeaway from the event. Thank you!</p>
<p><img data-src="/picture/apachecon/apache_druid_cloud_native_architecture_evolution_comment_1.png" alt=""></p>
<p><img data-src="/picture/apachecon/apache_druid_cloud_native_architecture_evolution_comment_2.png" alt=""></p>
<p><img data-src="/picture/apachecon/apache_druid_cloud_native_architecture_evolution_comment_3.png" alt=""></p>
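<p>Below are the three core questions the talk examined in detail:</p>
<ol>
<li>Why should we evolve toward a cloud-native architecture?</li>
<li>What do we need to do to become cloud native?</li>
<li>What pitfalls might we run into along the way?</li>
</ol>
<p>Thank you for your interest and support!</p>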
Benedict Jin's Blog
https://yuzhouwan.com/posts/18517/
2014-11-01T00:10:59.000Z
2023-09-30T10:17:47.915Z
<h2 id="Welcome"><a href="#Welcome" class="headerlink" title="Welcome"></a>Welcome</h2><p> <strong>Welcome to My Blog!</strong></p>
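<h2 id="博客介绍"><a href="#博客介绍" class="headerlink" title="博客介绍"></a>About the Blog</h2><p> "My life is finite, while knowledge is infinite", and I pursue the infinite with the finite (quoted somewhat out of context, but a passion for seeking knowledge is essential)</p>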
<h2 id="大事件纪实"><a href="#大事件纪实" class="headerlink" title="大事件纪实"></a>大事件纪实</h2><meta charset="utf-8"><meta content="width=device-width, initial-scale=1.0" name="viewport"><title>宇宙湾 - 大事件纪实</title><style>html, body { margin: 0; padding: 0; font-family: Helvetica, sans-serif;}section#timeline { width: 80%; margin: 20px auto; position: relative; padding: 0px 5px 1px 5px; background: #25303B;}section#timeline:before { content: ''; display: block; position: absolute; left: 50%; top: 0; margin: 0 0 0 -1px; width: 2px; height: 100%; background: rgba(255,255,255,0.2);}section#timeline article { width: 100%; margin: 0 0 20px 0; position: relative;}section#timeline article:after { content: ''; display: block; clear: both;}section#timeline article div.inner { width: 40%; float: left; margin: 5px 0 0 0; border-radius: 6px;}section#timeline article div.inner span.date { display: block; width: 60px; height: 50px; padding: 5px 0; position: absolute; top: 0; left: 50%; margin: 0 0 0 -32px; border-radius: 100%; font-size: 12px; font-weight: 900; text-transform: uppercase; background: #25303B; color: rgba(255,255,255,0.5); border: 2px solid rgba(255,255,255,0.2); box-shadow: 0 0 0 7px #25303B;}section#timeline article div.inner span.date span { display: block; text-align: center;}section#timeline article div.inner span.date span.day { font-size: 10px;}section#timeline article div.inner span.date span.month { font-size: 18px;}section#timeline article div.inner span.date span.year { font-size: 10px;}section#timeline article div.inner h2 { padding: 15px; margin: 0; color: #fff; font-size: 20px; text-transform: uppercase; letter-spacing: -1px; border-radius: 6px 6px 0 0; position: relative;}section#timeline article div.inner h2:after { content: ''; position: absolute; top: 20px; right: -5px; width: 10px; height: 10px; -webkit-transform: rotate(-45deg);}section#timeline article div.inner p { padding: 15px; margin: 0; font-size: 14px; background: #fff; color: #656565; border-radius: 0 0 
6px 6px;}section#timeline article:nth-child(2n+2) div.inner { float: right;}section#timeline article:nth-child(2n+2) div.inner h2:after { left: -5px;}section#timeline article:nth-child(1) div.inner h2 { background: #725f60;}section#timeline article:nth-child(1) div.inner h2:after { background: #725f60;}section#timeline article:nth-child(2) div.inner h2 { background: #b782ab;}section#timeline article:nth-child(2) div.inner h2:after { background: #b782ab;}section#timeline article:nth-child(3) div.inner h2 { background: #829ab5;}section#timeline article:nth-child(3) div.inner h2:after { background: #829ab5;}section#timeline article:nth-child(4) div.inner h2 { background: #a66a48;}section#timeline article:nth-child(4) div.inner h2:after { background: #a66a48;}section#timeline article:nth-child(5) div.inner h2 { background: #4295b7;}section#timeline article:nth-child(5) div.inner h2:after { background: #4295b7;}section#timeline article:nth-child(6) div.inner h2 { background: #81b85d;}section#timeline article:nth-child(6) div.inner h2:after { background: #81b85d;}section#timeline article:nth-child(7) div.inner h2 { background: #8c7c75;}section#timeline article:nth-child(7) div.inner h2:after { background: #8c7c75;}section#timeline article:nth-child(8) div.inner h2 { background: #5a963d;}section#timeline article:nth-child(8) div.inner h2:after { background: #5a963d;}section#timeline article:nth-child(9) div.inner h2 { background: #ba8036;}section#timeline article:nth-child(9) div.inner h2:after { background: #ba8036;}section#timeline article:nth-child(10) div.inner h2 { background: #6a7576;}section#timeline article:nth-child(10) div.inner h2:after { background: #6a7576;}section#timeline article:nth-child(11) div.inner h2 { background: #9a9643;}section#timeline article:nth-child(11) div.inner h2:after { background: #9a9643;}section#timeline article:nth-child(12) div.inner h2 { background: #959bc8;}section#timeline article:nth-child(12) div.inner h2:after { background: 
#959bc8;}section#timeline article:nth-child(13) div.inner h2 { background: #a3bf8b;}section#timeline article:nth-child(13) div.inner h2:after { background: #a3bf8b;}section#timeline article:nth-child(14) div.inner h2 { background: #667678;}section#timeline article:nth-child(14) div.inner h2:after { background: #667678;}section#timeline article:nth-child(15) div.inner h2 { background: #917d86;}section#timeline article:nth-child(15) div.inner h2:after { background: #917d86;}section#timeline article:nth-child(16) div.inner h2 { background: #c3bb47;}section#timeline article:nth-child(16) div.inner h2:after { background: #c3bb47;}section#timeline article:nth-child(17) div.inner h2 { background: #5ba5c3;}section#timeline article:nth-child(17) div.inner h2:after { background: #5ba5c3;}section#timeline article:nth-child(18) div.inner h2 { background: #a78554;}section#timeline article:nth-child(18) div.inner h2:after { background: #a78554;}section#timeline article:nth-child(19) div.inner h2 { background: #b2bb93;}section#timeline article:nth-child(19) div.inner h2:after { background: #b2bb93;}section#timeline article:nth-child(20) div.inner h2 { background: #9c648a;}section#timeline article:nth-child(20) div.inner h2:after { background: #9c648a;}</style><section id="timeline"><article><div class="inner"><span class="date"><span class="year">2014</span><span class="month">Nov</span><span class="day">01<sup>st</sup></span></span><h2>混沌初开</h2><p>建站第一天</p></div></article><article><div class="inner"><span class="date"><span class="year">2015</span><span class="month">Jan</span><span class="day">01<sup>st</sup></span></span><h2>模糊的记忆</h2><p>Hexo 框架 / next 主题 / 七牛图床 / Gulp 压缩 / 静态资源 CDN / 支持 MathJax</p></div></article><article><div class="inner"><span class="date"><span class="year">2017</span><span class="month">Apr</span><span class="day">10<sup>th</sup></span></span><h2>多说关闭</h2><p>评论系统切换为 Disqus</p></div></article><article><div class="inner"><span class="date"><span 
class="year">2017</span><span class="month">Apr</span><span class="day">22<sup>nd</sup></span></span><h2>Order by Update</h2><p>文章以<b>最后更新时间</b>倒排展示(避免养成隔一段时间水一篇的坏习惯)</p></div></article><article><div class="inner"><span class="date"><span class="year">2017</span><span class="month">May</span><span class="day">25<sup>th</sup></span></span><h2>Aliyun 备案</h2><p>苏 ICP</p></div></article><article><div class="inner"><span class="date"><span class="year">2017</span><span class="month">Oct</span><span class="day">10<sup>th</sup></span></span><h2>全站 HTTPS</h2><p>TrustAsia 域名证书</p></div></article><article><div class="inner"><span class="date"><span class="year">2017</span><span class="month">Nov</span><span class="day">15<sup>th</sup></span></span><h2>Coding.net</h2><p>静态页面从 github.io 切换为 coding.net(香港服务器)</p></div></article><article><div class="inner"><span class="date"><span class="year">2017</span><span class="month">Nov</span><span class="day">19<sup>th</sup></span></span><h2>不蒜子 502</h2><p>页面统计切换为 Lean Cloud,之前的 PV / UV 统计无奈清零</p></div></article><article><div class="inner"><span class="date"><span class="year">2017</span><span class="month">Nov</span><span class="day">20<sup>th</sup></span></span><h2>DDoS 攻击解除</h2><p>回归不蒜子</p></div></article><article><div class="inner"><span class="date"><span class="year">2018</span><span class="month">May</span><span class="day">29<sup>th</sup></span></span><h2>Gitment</h2><p>延迟加载 Gitment</p></div></article><article><div class="inner"><span class="date"><span class="year">2019</span><span class="month">Apr</span><span class="day">20<sup>th</sup></span></span><h2>回归 Github Page</h2><p>Github Page 开始支持 HTTPS</p></div></article><article><div class="inner"><span class="date"><span class="year">2019</span><span class="month">Apr</span><span class="day">21<sup>st</sup></span></span><h2>全站 CDN</h2><p>阿里云 DCDN</p></div></article><article><div class="inner"><span class="date"><span class="year">2019</span><span class="month">Apr</span><span 
class="day">27<sup>th</sup></span></span><h2>简繁切换</h2><p>支持简体与繁体切换</p></div></article><article><div class="inner"><span class="date"><span class="year">2019</span><span class="month">May</span><span class="day">01<sup>st</sup></span></span><h2>支持 Gitalk</h2><p>Gitment 验证存在跨域问题,而 Gitalk 可以无缝迁移</p></div></article><article><div class="inner"><span class="date"><span class="year">2019</span><span class="month">May</span><span class="day">02<sup>nd</sup></span></span><h2>支持 DaoVoice</h2><p>可以匿名留言,在线沟通</p></div></article><article><div class="inner"><span class="date"><span class="year">2019</span><span class="month">May</span><span class="day">11<sup>th</sup></span></span><h2>暂闭 DaoVoice</h2><p>出于其服务稳定性的考量,暂时关闭</p></div></article><article><div class="inner"><span class="date"><span class="year">2019</span><span class="month">May</span><span class="day">11<sup>th</sup></span></span><h2>设计 Logo</h2><p>新 Logo 寓意着浩瀚宇宙中的一处安心的港湾</p></div></article><article><div class="inner"><span class="date"><span class="year">2020</span><span class="month">Jan</span><span class="day">01<sup>st</sup></span></span><h2>源站迁移</h2><p>全站迁移至阿里云 OSS,代替 Github Page 作为源站</p></div></article><article><div class="inner"><span class="date"><span class="year">2020</span><span class="month">Feb</span><span class="day">09<sup>th</sup></span></span><h2>镜像网站</h2><p>搭建镜像网站 yuzhouwan.github.io</p></div></article><article><div class="inner"><span class="date"><span class="year">2023</span><span class="month">Mar</span><span class="day">18<sup>th</sup></span></span><h2>GPT 加持</h2><p>使用 GPT-4 模型进行网站优化</p></div></article></section>
Apache Flink
https://yuzhouwan.com/posts/20644/
2017-06-06T12:19:28.000Z
2023-09-30T10:17:47.913Z
<h2 id="什么是-Flink?"><a href="#什么是-Flink?" class="headerlink" title="什么是 Flink?"></a>What Is Flink?</h2><blockquote>
<p><strong><a href="https://flink.apache.org/">Apache Flink</a></strong>™ is a framework and distributed processing engine for stateful computations over <em>unbounded and bounded</em> data streams. Flink has been designed to run in <em>all common cluster environments</em>, perform computations at <em>in-memory speed</em> and at <em>any scale</em>.</p>
</blockquote>
<h2 id="Flink-架构"><a href="#Flink-架构" class="headerlink" title="Flink 架构"></a>Flink Architecture</h2><h3 id="核心组件布局"><a href="#核心组件布局" class="headerlink" title="核心组件布局"></a>Core Component Layout</h3><p><img data-src="/picture/flink/apache_flink_stack.png" alt="Apache Flink Stack"></p>
<center>(Image source: the <a href="https://flink.apache.org/" target="_blank">Apache Flink</a>™ website)</center>
An In-Depth Look at Apache Eagle
https://yuzhouwan.com/posts/39683/
2017-04-05T13:38:04.000Z
2023-09-30T10:17:47.912Z
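<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><p> <a href="https://yuzhouwan.com/posts/39683/">Apache Eagle</a> is a <code>highly extensible</code> monitoring and alerting platform built on a <code>flexibly designed</code> application framework and <code>battle-tested</code> big data technologies such as <a href="https://yuzhouwan.com/posts/26002/">Kafka</a>, <a href="https://yuzhouwan.com/posts/4735/">Spark</a>, and <a href="https://yuzhouwan.com/posts/13977/">Storm</a>. It ships with a rich set of monitoring applications for big data platforms, for example <a href="https://yuzhouwan.com/posts/60504/">HDFS</a> / <a href="https://yuzhouwan.com/posts/45888/">HBase</a> / YARN service <code>health checks</code>, <code>JMX metrics</code>, <code>daemon logs</code>, <code>audit logs</code>, and <code>YARN</code> applications. External Eagle developers can build <code>custom applications</code> to monitor their NoSQL databases or web servers, and can decide for themselves whether to share them in the <code>Eagle application repository</code>. It also provides a state-of-the-art <code>alert engine</code> that reports <code>security breaches</code>, <code>service failures</code>, and <code>application anomalies</code>, and is <code>highly customizable</code> through alert policy definitions.</p>
<h3 id="Site"><a href="#Site" class="headerlink" title="Site"></a>Site</h3><p> Manages a group of <code>application</code> instances and distinguishes between applications that have been installed more than once</p>
<h3 id="Application"><a href="#Application" class="headerlink" title="Application"></a>Application</h3><p> An application (or monitoring application) is a first-class citizen in Apache Eagle. It represents an <code>end-to-end</code> <code>monitoring</code> / <code>alerting</code> solution, which usually contains the onboarding of the <code>monitoring source</code>, the source <code>schema</code> specification, the <code>alert policy</code>, and the <code>dashboard definition</code></p>
<h3 id="Stream"><a href="#Stream" class="headerlink" title="Stream"></a>Stream</h3><p> A stream is the input to the Alert Engine, and each <code>application</code> should have its own developer-defined streams. A stream definition usually contains a <code>POJO</code>-like structure. Once the stream is defined, the <code>application</code> should have the logic to write data into <code>Kafka</code></p>
Organizing the Calculus Knowledge System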
https://yuzhouwan.com/posts/200726/
2020-07-25T16:03:38.000Z
2023-09-23T10:37:06.029Z
Here's something encrypted, password is required to continue reading.
Gleanings from Discrete Mathematics
https://yuzhouwan.com/posts/200307/
2020-03-07T00:16:31.000Z
2023-09-23T10:37:06.029Z
Here's something encrypted, password is required to continue reading.
Apache OpenWhisk: A High-Performance Open-Source Serverless Cloud Platform
https://yuzhouwan.com/posts/201008/
2020-10-08T12:18:06.000Z
2023-09-23T10:37:06.022Z
Here's something encrypted, password is required to continue reading.
Docker, the Container Engine
https://yuzhouwan.com/posts/200314/
2020-03-14T15:45:56.000Z
2023-08-19T08:11:53.595Z
<h2 id="什么是-Docker?"><a href="#什么是-Docker?" class="headerlink" title="什么是 Docker?"></a>What Is Docker?</h2><blockquote>
<p><a href="https://docs.docker.com/">Docker</a>™ provides a way to run applications securely isolated in a container, packaged with all its dependencies and libraries.</p>
</blockquote>
<h2 id="环境搭建"><a href="#环境搭建" class="headerlink" title="环境搭建"></a>Environment Setup</h2><h3 id="下载"><a href="#下载" class="headerlink" title="下载"></a>Download</h3><h4 id="CentOS"><a href="#CentOS" class="headerlink" title="CentOS"></a>CentOS</h4><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun</span><br><span class="line">$ systemctl start docker</span><br><span class="line">$ systemctl status docker</span><br></pre></td></tr></tbody></table></figure>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">● docker.service - Docker Application Container Engine</span><br><span class="line"> Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)</span><br><span class="line"> Active: active (running) since Sat 2023-03-25 14:22:34 CST; 18s ago</span><br><span class="line"> Docs: https://docs.docker.com</span><br><span class="line"> Main PID: 1971 (dockerd)</span><br><span class="line"> Tasks: 7</span><br><span class="line"> Memory: 108.2M</span><br><span class="line"> CGroup: /system.slice/docker.service</span><br><span class="line"> └─1971 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock</span><br><span class="line"></span><br><span class="line">Mar 25 14:22:33 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:33.594962610+08:00"</span> level=info msg=<span class="string">"Loading containers: start."</span></span><br><span class="line">Mar 25 14:22:33 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:33.907783608+08:00"</span> level=info msg=<span class="string">"Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. 
Daemon option --bip can be us...red IP address"</span></span><br><span class="line">Mar 25 14:22:33 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:33.998627114+08:00"</span> level=info msg=<span class="string">"Loading containers: done."</span></span><br><span class="line">Mar 25 14:22:34 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:34.077496453+08:00"</span> level=warning msg=<span class="string">"WARNING: bridge-nf-call-iptables is disabled"</span></span><br><span class="line">Mar 25 14:22:34 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:34.077526272+08:00"</span> level=warning msg=<span class="string">"WARNING: bridge-nf-call-ip6tables is disabled"</span></span><br><span class="line">Mar 25 14:22:34 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:34.077573836+08:00"</span> level=info msg=<span class="string">"Docker daemon"</span> commit=bc3805a graphdriver=overlay2 version=23.0.1</span><br><span class="line">Mar 25 14:22:34 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:34.077678364+08:00"</span> level=info msg=<span class="string">"Daemon has completed initialization"</span></span><br><span class="line">Mar 25 14:22:34 iZt4n6q3i85nj90kbsfqz5Z systemd[1]: Started Docker Application Container Engine.</span><br><span class="line">Mar 25 14:22:34 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:34.128839102+08:00"</span> level=info msg=<span class="string">"[core] [Server #7] Server created"</span> module=grpc</span><br><span class="line">Mar 25 14:22:34 iZt4n6q3i85nj90kbsfqz5Z dockerd[1971]: time=<span class="string">"2023-03-25T14:22:34.162907755+08:00"</span> level=info msg=<span class="string">"API listen on /run/docker.sock"</span></span><br><span class="line">Hint: Some lines were ellipsized, use -l to show <span class="keyword">in</span> 
full.</span><br></pre></td></tr></tbody></table></figure>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ docker version</span><br></pre></td></tr></tbody></table></figure>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line">Client: Docker Engine - Community</span><br><span class="line"> Version: 23.0.1</span><br><span class="line"> API version: 1.42</span><br><span class="line"> Go version: go1.19.5</span><br><span class="line"> Git commit: a5ee5b1</span><br><span class="line"> Built: Thu Feb 9 19:51:00 2023</span><br><span class="line"> OS/Arch: linux/amd64</span><br><span class="line"> Context: default</span><br><span class="line"></span><br><span class="line">Server: Docker Engine - Community</span><br><span class="line"> Engine:</span><br><span class="line"> Version: 23.0.1</span><br><span class="line"> API version: 1.42 (minimum version 1.12)</span><br><span class="line"> Go version: go1.19.5</span><br><span class="line"> Git commit: bc3805a</span><br><span class="line"> Built: Thu Feb 9 19:48:42 2023</span><br><span class="line"> OS/Arch: linux/amd64</span><br><span class="line"> Experimental: <span class="literal">false</span></span><br><span class="line"> containerd:</span><br><span class="line"> Version: 
1.6.19</span><br><span class="line"> GitCommit: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f</span><br><span class="line"> runc:</span><br><span class="line"> Version: 1.1.4</span><br><span class="line"> GitCommit: v1.1.4-0-g5fd4c4d</span><br><span class="line"> docker-init:</span><br><span class="line"> Version: 0.19.0</span><br><span class="line"> GitCommit: de40ad0</span><br></pre></td></tr></tbody></table></figure>
<h4 id="MacOS"><a href="#MacOS" class="headerlink" title="MacOS"></a>macOS</h4><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># https://docs.docker.com/desktop/mac/install/</span></span><br><span class="line"><span class="comment"># Download and install the build that matches your chip type</span></span><br></pre></td></tr></tbody></table></figure>
<h4 id="Windows"><a href="#Windows" class="headerlink" title="Windows"></a>Windows</h4><p> Find DockerToolbox-19.03.1.exe on the Toolbox Archive page and download it</p>
<div class="note warning">The Toolbox project stopped being maintained in 2021</div>
<h3 id="安装"><a href="#安装" class="headerlink" title="安装"></a><a href="https://docs.docker.com/toolbox/overview/">Installation</a></h3><p> When choosing the components to install, select <code>Full installation</code> and keep the default options for everything else</p>
<h3 id="配置"><a href="#配置" class="headerlink" title="配置"></a>Configuration</h3><h4 id="代理"><a href="#代理" class="headerlink" title="代理"></a><a href="https://docs.docker.com/network/proxy/#configure-the-docker-client">Proxy</a></h4><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ vim ~/.docker/config.json</span><br></pre></td></tr></tbody></table></figure>
<figure class="highlight json"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">{</span></span><br><span class="line"> <span class="attr">"proxies"</span><span class="punctuation">:</span> <span class="punctuation">{</span></span><br><span class="line"> <span class="attr">"default"</span><span class="punctuation">:</span> <span class="punctuation">{</span></span><br><span class="line"> <span class="attr">"httpProxy"</span><span class="punctuation">:</span> <span class="string">"socks5://127.0.0.1:1080"</span><span class="punctuation">,</span></span><br><span class="line"> <span class="attr">"httpsProxy"</span><span class="punctuation">:</span> <span class="string">"socks5://127.0.0.1:1080"</span><span class="punctuation">,</span></span><br><span class="line"> <span class="attr">"noProxy"</span><span class="punctuation">:</span> <span class="string">"*.yuzhouwan.com"</span></span><br><span class="line"> <span class="punctuation">}</span></span><br><span class="line"> <span class="punctuation">}</span></span><br><span class="line"><span class="punctuation">}</span></span><br></pre></td></tr></tbody></table></figure>
Advanced Maven Techniques
https://yuzhouwan.com/posts/2254/
2017-04-10T15:11:13.000Z
2023-08-19T08:11:53.594Z
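Practical tips for working with Maven, along with solutions to common problems involving the local cache, downloads, compilation, syntax, dependencies, and related plugins.
Gradle in Practice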
https://yuzhouwan.com/posts/190816/
2019-08-15T17:16:02.000Z
2023-06-10T01:55:34.682Z
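<h2 id="Gradle-是什么?"><a href="#Gradle-是什么?" class="headerlink" title="Gradle 是什么?"></a>What Is Gradle?</h2><p> Gradle™ is a project build-automation tool based on the concepts of Apache Ant and Apache <a href="https://yuzhouwan.com/posts/2254/">Maven</a>. It uses a Groovy-based domain-specific language to declare project settings instead of traditional XML. The languages it currently supports are limited to <a href="https://yuzhouwan.com/posts/27328/">Java</a>, Groovy, and <a href="https://yuzhouwan.com/posts/18651/">Scala</a>, with support for more languages planned. (Source: <a href="https://zh.wikipedia.org/wiki/Gradle">wikipedia.org</a>)</p>
<h2 id="特性"><a href="#特性" class="headerlink" title="特性"></a>Features</h2><ul>
<li>Project configuration is declared in a DSL, which is more intuitive</li>
<li>Fine-grained transitive dependency management</li>
<li>Incremental compilation</li>
<li>Efficient in-memory execution</li>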
</ul>
Netty: From Getting Started to Practice
https://yuzhouwan.com/posts/200316/
2020-03-16T15:16:31.000Z
2023-02-18T14:10:01.531Z
<h2 id="Netty-是什么?"><a href="#Netty-是什么?" class="headerlink" title="Netty 是什么?"></a>What Is Netty?</h2><blockquote>
<p><strong><a href="https://netty.io/">Netty</a></strong>™ is an asynchronous event-driven network application framework for rapid development of maintainable high performance protocol servers & clients.</p>
</blockquote>
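<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><h3 id="Channel"><a href="#Channel" class="headerlink" title="Channel"></a>Channel</h3><p> Represents an open connection to an entity (a hardware device, a file, a network socket, and so on) that can perform I/O operations such as reads or writes</p>
<h3 id="Callback"><a href="#Callback" class="headerlink" title="Callback"></a>Callback</h3><p> Represents a method that is invoked after some event has been handled</p>
<h3 id="Future"><a href="#Future" class="headerlink" title="Future"></a>Future</h3><p> Represents a placeholder for the result of an asynchronous operation</p>
<h3 id="Event"><a href="#Event" class="headerlink" title="Event"></a>Event</h3><p> Represents an occurrence that may trigger a corresponding action, such as a connection becoming active or a user event</p>
<h3 id="ChannelHandler"><a href="#ChannelHandler" class="headerlink" title="ChannelHandler"></a>ChannelHandler</h3><p> Represents a callback that is executed in response to a specific event</p>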
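<p>The Future and Callback concepts above can be sketched with the JDK alone. This is only an analogy, not Netty code: it uses <code>java.util.concurrent.CompletableFuture</code> in place of Netty's own future types, and the class name <code>FutureCallbackSketch</code> and the method <code>readAsync</code> are hypothetical names introduced for illustration.</p>

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical class for illustration; CompletableFuture stands in for a Netty future.
class FutureCallbackSketch {

    // Starts an asynchronous "read" and returns at once with a placeholder for its result.
    static CompletableFuture<String> readAsync() {
        return CompletableFuture.supplyAsync(() -> "hello");
    }

    public static void main(String[] args) {
        AtomicReference<String> received = new AtomicReference<>();

        // The callback runs once the asynchronous operation has completed.
        CompletableFuture<Void> done = readAsync().thenAccept(received::set);

        // Block only so that the demo does not exit before the callback has run.
        done.join();
        System.out.println(received.get()); // prints hello
    }
}
```

<p>The caller gets the placeholder immediately and registers what should happen on completion, which is the same division of labor the Future and Callback definitions describe.</p>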
Antlr
https://yuzhouwan.com/posts/55501/
2015-02-02T03:09:26.000Z
2023-02-18T14:10:01.528Z
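An introduction to ANTLR's basic concepts, features, working mechanism, and internal workflow, along with some pitfalls encountered along the way.
Nginx: A High-Performance Reverse Proxy Server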
https://yuzhouwan.com/posts/200321/
2020-03-20T23:52:01.000Z
2022-10-23T08:17:06.290Z
<h2 id="Nginx-是什么?"><a href="#Nginx-是什么?" class="headerlink" title="Nginx 是什么?"></a>What Is Nginx?</h2><blockquote>
<p><strong><a href="https://nginx.org/">Nginx</a></strong>™ [engine x] is an HTTP and reverse proxy server, a mail proxy server, and a generic TCP/UDP proxy server</p>
</blockquote>
<h2 id="环境搭建"><a href="#环境搭建" class="headerlink" title="环境搭建"></a>Environment Setup</h2><h3 id="下载"><a href="#下载" class="headerlink" title="下载"></a>Download</h3><p> From the Nginx <a href="https://nginx.org/download/">Archive</a> download page, download the <a href="https://nginx.org/download/nginx-1.13.12.tar.gz">nginx-1.13.12.tar.gz</a> package</p>
<h3 id="安装依赖"><a href="#安装依赖" class="headerlink" title="安装依赖"></a>Install Dependencies</h3><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ yum -y install openssl openssl-devel</span><br><span class="line">$ yum -y install pcre-devel</span><br></pre></td></tr></tbody></table></figure>
<h3 id="编译安装"><a href="#编译安装" class="headerlink" title="编译安装"></a>Build and Install</h3><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">$ tar zxvf nginx-1.13.12.tar.gz</span><br><span class="line"><span class="comment"># You must change into the extracted nginx directory first</span></span><br><span class="line">$ <span class="built_in">cd</span> nginx-1.13.12</span><br><span class="line">$ ./configure --prefix=/usr/local/nginx --conf-path=/usr/local/nginx/nginx.conf</span><br><span class="line">$ make -j4 && make -j4 install</span><br></pre></td></tr></tbody></table></figure>
<h3 id="启动"><a href="#启动" class="headerlink" title="启动"></a>Start</h3><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">cd</span> /usr/local/nginx/</span><br><span class="line">$ sbin/nginx -c /usr/local/nginx/nginx.conf</span><br></pre></td></tr></tbody></table></figure>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ ps -ef | grep nginx</span><br></pre></td></tr></tbody></table></figure>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">root 107034 1 0 Oct31 ? 00:00:00 nginx: master process sbin/nginx</span><br><span class="line">nobody 107036 107034 0 Oct31 ? 00:00:00 nginx: worker process</span><br><span class="line">nobody 107266 107265 0 Oct31 ? 00:00:00 tsar --check --apache --cpu --mem --load --io --traffic --tcp --partition --nginx --swap</span><br><span class="line">root 107270 97588 0 Oct31 pts/1 00:00:00 grep nginx</span><br></pre></td></tr></tbody></table></figure>
Presto: A Distributed SQL Query Engine
https://yuzhouwan.com/posts/200906/
2020-09-06T15:06:25.000Z
2022-07-17T06:12:36.181Z
<h2 id="Presto-是什么?"><a href="#Presto-是什么?" class="headerlink" title="Presto 是什么?"></a>What Is Presto?</h2><blockquote>
<p><strong>Presto</strong>™ (PrestoDB™) is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.</p>
<p><strong>Presto</strong>™ (PrestoSQL™, a.k.a. Trino™) is a high performance, distributed SQL query engine for big data.</p>
</blockquote>
<div class="note success">The differences between the two are covered in detail below</div>
<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><h3 id="组件"><a href="#组件" class="headerlink" title="组件"></a>Components</h3><h4 id="Coordinator"><a href="#Coordinator" class="headerlink" title="Coordinator"></a>Coordinator</h4><p> Manages the Worker and MetaStore nodes, accepts query requests from clients, parses the SQL (Parser), generates and optimizes the execution plan (Planner), and schedules the query tasks (Scheduler)</p>
<div class="note info">The Coordinator talks to Clients and Workers through a RESTful interface</div>
<h4 id="Worker"><a href="#Worker" class="headerlink" title="Worker"></a>Worker</h4><p> Carries out the actual query computation and reads and writes the data</p>
<h4 id="Discovery-Server"><a href="#Discovery-Server" class="headerlink" title="Discovery Server"></a>Discovery Server</h4><p> Discovers the nodes in the cluster and is used for heartbeat monitoring between them</p>
<div class="note success">The Discovery Server is usually co-located with the Coordinator, but it can also be deployed on its own</div>
Scala in Practice
https://yuzhouwan.com/posts/18651/
2018-06-06T15:02:18.000Z
2022-07-17T06:11:46.715Z
<h2 id="实用技巧"><a href="#实用技巧" class="headerlink" title="实用技巧"></a>Practical Tips</h2><h3 id="List"><a href="#List" class="headerlink" title="List"></a>List</h3><figure class="highlight scala"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">List</span>(<span class="number">1</span>, <span class="number">9</span>, <span class="number">2</span>, <span class="number">4</span>, <span class="number">5</span>) span (_ < <span class="number">3</span>) <span class="comment">// (List(1), List(9, 2, 4, 5)) stops at the first element that fails the predicate</span></span><br><span class="line"></span><br><span class="line"><span class="type">List</span>(<span class="number">1</span>, <span class="number">9</span>, <span class="number">2</span>, <span class="number">4</span>, <span class="number">5</span>) partition (_ < <span class="number">3</span>) <span class="comment">// (List(1, 2), List(9, 4, 5)) scans every element</span></span><br><span class="line"></span><br><span class="line"><span class="type">List</span>(<span class="number">1</span>, <span class="number">9</span>, <span class="number">2</span>, <span class="number">4</span>, <span class="number">5</span>) splitAt <span class="number">2</span> <span class="comment">// (List(1, 9), List(2, 4, 5)) splits at the given index</span></span><br><span class="line"></span><br><span class="line"><span class="type">List</span>(<span class="number">1</span>, <span class="number">9</span>, <span class="number">2</span>, <span class="number">4</span>, <span class="number">5</span>) groupBy (<span class="number">5</span> < _) <span class="comment">// Map(false -> List(1, 2, 4, 5), true -> List(9)) groups into a Map keyed by the Boolean result</span></span><br></pre></td></tr></tbody></table></figure>
<h3 id="Iterator"><a href="#Iterator" class="headerlink" title="Iterator"></a>Iterator</h3><h4 id="grouped"><a href="#grouped" class="headerlink" title="grouped"></a>grouped</h4><figure class="highlight scala"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Assumption: JSON here is fastjson's com.alibaba.fastjson.JSON</span></span><br><span class="line"><span class="keyword">import</span> com.alibaba.fastjson.<span class="type">JSON</span></span><br><span class="line"><span class="keyword">import</span> org.apache.spark.<span class="type">SparkConf</span></span><br><span class="line"><span class="keyword">import</span> org.apache.spark.sql.<span class="type">SparkSession</span></span><br><span class="line"><span class="keyword">import</span> org.apache.spark.sql.<span class="type">BigquerySparkSession</span>._</span><br><span class="line"></span><br><span class="line"><span class="keyword">val</span> conf = <span class="keyword">new</span> <span class="type">SparkConf</span>()</span><br><span class="line"><span class="keyword">val</span> builder = <span class="type">SparkSession</span>.builder().config(conf).enableHiveSupport()</span><br><span class="line"><span class="keyword">val</span> spark = builder.getOrCreateBigquerySparkSession()</span><br><span class="line"><span class="comment">// spark.sql runs a single statement per call, so select the database first</span></span><br><span class="line">spark.sql(<span class="string">"use db"</span>)</span><br><span class="line"><span class="keyword">val</span> df = spark.sql(<span class="string">"select * from table"</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">val</span> dataset = df.rdd.mapPartitions(iter => {</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Process the rows of each partition in batches of 100</span></span><br><span class="line"> iter.grouped(<span class="number">100</span>)</span><br><span class="line"> .flatMap(rows => rows.map(row => <span class="type">JSON</span>.toJSONString(row, <span class="literal">false</span>)))</span><br><span class="line">})</span><br><span class="line"></span><br><span class="line"><span class="keyword">val</span> filteredEmptyLine = dataset</span><br><span class="line"> .filter(_ != <span class="literal">null</span>)</span><br><span class="line"> .map(<span class="type">JSON</span>.toJSONString(_, <span class="literal">false</span>))</span><br><span class="line"> .filter(_.trim.length != <span class="number">0</span>)</span><br></pre></td></tr></tbody></table></figure>
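<p> The batching idea behind <code>grouped</code> is not specific to Scala. The following is a minimal Python sketch of the same technique, assuming each row is a plain dict; the helper names <code>grouped</code> and <code>to_records</code> are made up for illustration and are not part of Spark or any library:</p>

```python
import json
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def grouped(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` items (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(items)
    while True:
        batch = list(islice(it, size))  # pull the next `size` items, if any
        if not batch:
            return
        yield batch


def to_records(rows: Iterable[dict], batch_size: int = 100) -> Iterator[str]:
    """Serialize rows batch by batch, flattening the batches back into one stream."""
    for batch in grouped(rows, batch_size):
        for row in batch:
            yield json.dumps(row)
```

<p> The last batch may be shorter than <code>batch_size</code>, which matches how <code>grouped</code> behaves on a Scala iterator.</p>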
InfluxDB: An Open Source Time Series Database
https://yuzhouwan.com/posts/200315/
2020-03-15T10:58:02.000Z
2022-07-16T05:05:29.265Z
<h2 id="介绍"><a href="#介绍" class="headerlink" title="介绍"></a>Introduction</h2><blockquote>
<p><strong><a href="https://docs.influxdata.com/">InfluxDB</a></strong>™ is a time series database designed to handle high write and query loads. It is an integral component of the TICK stack. InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.</p>
</blockquote>
<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><h3 id="DataBase"><a href="#DataBase" class="headerlink" title="DataBase"></a>DataBase</h3><p> Similar to the database concept in a traditional database</p>
<h3 id="Measurement"><a href="#Measurement" class="headerlink" title="Measurement"></a>Measurement</h3><p> Broadly matches the measure concept in OLAP; some OLAP databases call it a Metric</p>
<h3 id="Tag"><a href="#Tag" class="headerlink" title="Tag"></a>Tag</h3><p> Broadly matches the dimension concept in OLAP; some OLAP databases call it a TagKV</p>
<h3 id="Field"><a href="#Field" class="headerlink" title="Field"></a>Field</h3><p> The recorded value</p>
<h3 id="Timestamp"><a href="#Timestamp" class="headerlink" title="Timestamp"></a>Timestamp</h3><p> The time at which the value was recorded</p>
<h3 id="Points"><a href="#Points" class="headerlink" title="Points"></a>Points</h3><p> Individual data points</p>
<h3 id="Series"><a href="#Series" class="headerlink" title="Series"></a>Series</h3><p> A sequence made up of data points</p>
<h3 id="Retention-Policy"><a href="#Retention-Policy" class="headerlink" title="Retention Policy"></a>Retention Policy</h3><p> The data expiration policy, i.e. TTL</p>
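<p> To see how these concepts fit together, here is a small Python sketch that renders one point in the InfluxDB line protocol layout (<code>measurement,tag_set field_set timestamp</code>). It is a simplified illustration, not a client library: the function names are hypothetical, only numeric float fields are handled, and escaping of spaces and commas is deliberately left out:</p>

```python
from typing import Mapping


def series_key(measurement: str, tags: Mapping[str, str]) -> str:
    """Measurement plus sorted tag set; points sharing it belong to one series
    (simplified view, ignoring retention policy and field key)."""
    return measurement + "".join(f",{k}={v}" for k, v in sorted(tags.items()))


def to_line_protocol(
    measurement: str,
    tags: Mapping[str, str],
    fields: Mapping[str, float],
    timestamp_ns: int,
) -> str:
    """Render a single point; assumes float field values and no special characters."""
    if not fields:
        raise ValueError("a point needs at least one field")
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{series_key(measurement, tags)} {field_part} {timestamp_ns}"


point = to_line_protocol(
    "cpu",
    {"region": "us-west", "host": "server01"},
    {"usage": 0.64},
    1465839830100400200,
)
```

<p> Here <code>cpu</code> is the measurement, <code>host</code> and <code>region</code> are tags, <code>usage</code> is a field, and the trailing integer is the timestamp in nanoseconds.</p>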
Real-time ML with Spark
https://yuzhouwan.com/posts/4735/
2015-08-13T11:50:21.000Z
2022-07-16T05:05:17.490Z
This post shows how to implement real-time machine learning with Apache Spark.
Custom Development on Apache Superset
https://yuzhouwan.com/posts/743/
2017-04-03T12:59:57.000Z
2021-12-26T06:26:49.823Z
<h2 id="Apache-Superset-是什么?"><a href="#Apache-Superset-是什么?" class="headerlink" title="Apache Superset 是什么?"></a>What Is Apache Superset?</h2><blockquote>
<p><strong><a href="https://superset.apache.org/">Apache Superset</a></strong>™ is a modern data exploration and visualization platform.</p>
</blockquote>
<h2 id="基础组件"><a href="#基础组件" class="headerlink" title="基础组件"></a>Core Components</h2><h3 id="Flask"><a href="#Flask" class="headerlink" title="Flask"></a>Flask</h3><p> One of the best-known <a href="https://yuzhouwan.com/posts/43687/">Python</a> web frameworks, noted for being lightweight and highly extensible</p>
<ul>
<li><p>Jinja2<br> Template engine</p>
</li>
<li><p>Werkzeug<br> WSGI toolkit</p>
</li>
</ul>
<h3 id="Gunicorn"><a href="#Gunicorn" class="headerlink" title="Gunicorn"></a>Gunicorn</h3><p> Gunicorn is an open source Python WSGI HTTP server that uses the pre-fork model; it was ported from Ruby's Unicorn project</p>
<h4 id="WSGI"><a href="#WSGI" class="headerlink" title="WSGI"></a>WSGI</h4><p> WSGI, the Python <strong>W</strong>eb <strong>S</strong>erver <strong>G</strong>ateway <strong>I</strong>nterface, is the interface between Python applications or frameworks and web servers. It is a specification rather than a piece of software: any application that follows it can run on any compliant server, and any compliant server can host any such application</p>
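<p> As a rough illustration of what "following the protocol" means, the sketch below defines a minimal WSGI application callable and invokes it by hand, the way a server would. The names <code>application</code> and <code>call_app</code> are chosen for this example only, and the fake <code>environ</code> contains just the keys the example reads:</p>

```python
from typing import Callable, Iterable, Tuple


def application(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """A minimal WSGI app: takes (environ, start_response), returns an iterable of bytes."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app: Callable, path: str) -> Tuple[str, bytes]:
    """Hypothetical test helper that plays the role of the server for one request."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

<p> If this code lived in a module named <code>hello.py</code> (a hypothetical name), a WSGI server such as Gunicorn could serve it with <code>gunicorn hello:application</code>.</p>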
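<h4 id="Pre-Fork"><a href="#Pre-Fork" class="headerlink" title="Pre-Fork"></a>Pre-Fork</h4><p> Each process handles one request at a time; the model is based on select, whose default descriptor limit is 1024, so at most 1024 processes are created at a time<br> Processes are created in advance: pre-fork spawns child processes ahead of time and uses them to handle the requests, with one child process per request and every process independent of the others<br> To some extent this shortens the response time, because no process has to be created when a request arrives</p>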
The Complete Guide to Apache HBase
https://yuzhouwan.com/posts/45888/
2017-05-28T05:24:10.000Z
2021-07-18T05:44:05.021Z
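Covers the basic concepts, environment deployment, common commands, practical tips, architecture, and performance tuning of Apache HBase, and records some pitfalls encountered along the way together with their solutions.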
Apache Calcite: An Open Source SQL Parsing Tool
https://yuzhouwan.com/posts/201018/
2020-10-17T16:02:26.000Z
2021-07-18T05:43:58.756Z
<h2 id="Apache-Calcite-是什么?"><a href="#Apache-Calcite-是什么?" class="headerlink" title="Apache Calcite 是什么?"></a>What Is Apache Calcite?</h2><blockquote>
<p><strong><a href="https://github.com/apache/calcite">Apache Calcite</a></strong>™ is a dynamic data management framework.</p>
</blockquote>
<p><img data-src="/picture/calcite/calcite_mountain.jpg" alt="Calcite Mountain"></p>
<center>(Image source: <a href="https://pixabay.com/photos/landscape-travertine-pamukkale-2380833/" target="_blank">Pixabay</a>™, confirmed free of copyright restrictions)</center>
<h2 id="基本概念"><a href="#基本概念" class="headerlink" title="基本概念"></a>Basic Concepts</h2><h3 id="Catelog"><a href="#Catelog" class="headerlink" title="Catelog"></a>Catalog</h3><p> Defines the metadata and namespaces that SQL semantics depend on</p>
<h3 id="SQL-Parser"><a href="#SQL-Parser" class="headerlink" title="SQL Parser"></a>SQL Parser</h3><p> Converts SQL into an AST (<strong>A</strong>bstract <strong>S</strong>yntax <strong>T</strong>ree)</p>
<h3 id="SQL-Validator"><a href="#SQL-Validator" class="headerlink" title="SQL Validator"></a>SQL Validator</h3><p> Validates the AST against the Catalog</p>
<h3 id="Query-Optimizer"><a href="#Query-Optimizer" class="headerlink" title="Query Optimizer"></a>Query Optimizer</h3><p> Converts the AST into a physical execution plan and optimizes that plan</p>
<h3 id="SQL-Generator"><a href="#SQL-Generator" class="headerlink" title="SQL Generator"></a>SQL Generator</h3><p> Converts a physical execution plan back into a SQL statement</p>
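<h2 id="特性"><a href="#特性" class="headerlink" title="特性"></a>Features</h2><ul>
<li>Supports standard SQL</li>
<li>Can connect to any data source through adapters (<a href="https://calcite.apache.org/docs/adapter.html">Adapter</a>)</li>
<li>Supports a rich set of relational algebra operators (union, intersection, join, Cartesian product, and so on)</li>
<li>Supports custom logical planning rules (for example, filter pushdown)</li>
<li>Supports both cost-based optimization (<strong>CBO</strong>, <strong>C</strong>ost-<strong>B</strong>ased <strong>O</strong>ptimizer) and rule-based optimization (<strong>RBO</strong>, <strong>R</strong>ule-<strong>B</strong>ased <strong>O</strong>ptimizer)</li>
<li>Supports managing materialized views (<a href="https://calcite.apache.org/docs/materialized_views.html">Materialized view</a>)</li>
<li>Supports querying streaming data</li>
<li>Stable and reliable (more than 10 years of development)</li>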
<li>Donated to the Apache Software Foundation (it entered the Apache Incubator in 2014)</li>
<li>Active open source community (used by <a href="https://yuzhouwan.com/posts/5845/">Apache Druid</a>, Apache Hive, Apache Drill, <a href="https://yuzhouwan.com/posts/20644/">Apache Flink</a>, <a href="https://yuzhouwan.com/posts/45888/#Phoenix-%E5%91%BD%E4%BB%A4">Apache Phoenix</a>, and other projects)</li>
</ul>
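<div class="note success">Apache Calcite relies on the open source JavaCC for SQL parsing: JavaCC turns the SQL grammar definition into the Java code of the parser</div>
<div class="note success">Apache Calcite also uses the lightweight Janino compiler to compile Java code at runtime, which makes metadata management more flexible</div>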
An Ever-Expanding IoT Mind Map
https://yuzhouwan.com/posts/201220/
2020-12-20T03:50:00.000Z
2021-07-18T05:36:44.738Z
<h2 id="一幅持续扩展的物联网思维导图"><a href="#一幅持续扩展的物联网思维导图" class="headerlink" title="一幅持续扩展的物联网思维导图"></a>An Ever-Expanding IoT Mind Map</h2><p><img data-src="/picture/iot/iot_mind_map.png" alt="一幅持续扩展的物联网思维导图"></p>
<center>(Drawn with MindNode™)</center>
<h2 id="更新记录"><a href="#更新记录" class="headerlink" title="更新记录"></a>Change Log</h2><div class="table-container">
<table>
<thead>
<tr>
<th style="text-align:center">Date</th>
<th style="text-align:center">Update</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center">2020-12-20</td>
<td style="text-align:center">Initial release, covering definitions, industries, sensors, data analysis, security, protocols, organizations, and history</td>
</tr>
<tr>
<td style="text-align:center">2020-12-21</td>
<td style="text-align:center">Expanded the major IoT events between 1982 and 2016</td>
</tr>
<tr>
<td style="text-align:center">2020-12-22</td>
<td style="text-align:center">Added application scenarios for eSIM</td>
</tr>
<tr>
<td style="text-align:center">2020-12-25</td>
<td style="text-align:center">Added edge computing platforms and IoT operating systems</td>
</tr>
<tr>
<td style="text-align:center">2020-12-26</td>
<td style="text-align:center">Added statistics on the number of IoT devices</td>
</tr>
<tr>
<td style="text-align:center">2020-12-28</td>
<td style="text-align:center">Added the GSM Association (GSMA)</td>
</tr>
</tbody>
</table>
</div>
Why Does JavaScript Matter for Server-Side Development?
https://yuzhouwan.com/posts/19989/
2014-11-04T02:40:12.000Z
2021-07-18T05:36:44.717Z
<h2 id="开发人员用一种语言就能编写整个-Web-应用"><a href="#开发人员用一种语言就能编写整个-Web-应用" class="headerlink" title="开发人员用一种语言就能编写整个 Web 应用"></a>Developers Can Write an Entire Web Application in One Language</h2><p> It reduces the language switching needed when developing the client and the server (the same reasoning applies to Clojure and <a href="https://github.com/clojure/clojurescript/">ClojureScript</a>)<br> Code can be shared between the client and the server (for example, the same code for form validation or game logic)</p>
<h2 id="JSON-是目前非常流行的数据交换格式"><a href="#JSON-是目前非常流行的数据交换格式" class="headerlink" title="JSON 是目前非常流行的数据交换格式"></a>JSON Is a Very Popular Data Interchange Format</h2><p> <a href="https://www.json.org/json-zh.html">JSON</a> is also native to JavaScript</p>
<h2 id="有些-NoSQL-数据库中用的就是-JavaScript-语言"><a href="#有些-NoSQL-数据库中用的就是-JavaScript-语言" class="headerlink" title="有些 NoSQL 数据库中用的就是 JavaScript 语言"></a>Some NoSQL Databases Use JavaScript</h2><p> MongoDB uses JavaScript as both its administration and its query language<br> CouchDB's Map/reduce is written in JavaScript as well</p>
<h2 id="JavaScript-是一门编译目标语言"><a href="#JavaScript-是一门编译目标语言" class="headerlink" title="JavaScript 是一门编译目标语言"></a>JavaScript Is a Compilation Target</h2><p> <a href="https://github.com/jashkenas/coffeescript/wiki/List-of-languages-that-compile-to-JS/">List of languages that compile to JS</a></p>
<h2 id="Node-用的虚拟机(V8)会紧跟-ECMAScirpt-标准"><a href="#Node-用的虚拟机(V8)会紧跟-ECMAScirpt-标准" class="headerlink" title="Node 用的虚拟机(V8)会紧跟 ECMAScirpt 标准"></a>The VM Used by Node (V8) Tracks the ECMAScript Standard Closely</h2><p> To use a new JavaScript language feature in Node, you do not have to wait until every browser supports it</p>
Hash Tables
https://yuzhouwan.com/posts/31130/
2014-12-13T12:55:00.000Z
2021-07-18T05:36:44.717Z
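<h2 id="什么是-散列表"><a href="#什么是-散列表" class="headerlink" title="什么是 散列表?"></a>What Is a Hash Table?</h2><p> A hash table is a data structure that is accessed directly by key. In other words, it locates a record by mapping the key to a position in a table, which speeds up lookups. The mapping function is called the hash function, and the array that stores the records is called the hash table</p>
<h2 id="为什么要有-散列表"><a href="#为什么要有-散列表" class="headerlink" title="为什么要有 散列表?"></a>Why Use a Hash Table?</h2><h3 id="可以提供快速的插入操作和查找操作"><a href="#可以提供快速的插入操作和查找操作" class="headerlink" title="可以提供快速的插入操作和查找操作"></a>It Provides Fast Insertion and Lookup</h3><p> No matter how much data the hash table holds, insertion and lookup (and sometimes deletion) take close to constant time, i.e. <code>O(1)</code><br> In practice, this is only a few machine instructions<br> Hash tables are very fast. A program that needs to look up thousands of records within one second (a spell checker, for example) usually uses a hash table, whereas operations on a balanced tree typically take <code>O(log N)</code> time</p>
<h3 id="编程实现相对容易"><a href="#编程实现相对容易" class="headerlink" title="编程实现相对容易"></a>It Is Relatively Easy to Implement</h3><h2 id="散列表工作机制"><a href="#散列表工作机制" class="headerlink" title="散列表工作机制"></a>How a Hash Table Works</h2><h3 id="存储"><a href="#存储" class="headerlink" title="存储"></a>Storage</h3><p> An unordered symbol table implemented with an array<br> This means that once the array has been created it is hard to grow (some hash tables degrade severely when they are nearly full)<br> Either reserve enough space up front, or periodically migrate the data to a larger hash table</p>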
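<p> The storage trade-off above can be sketched in a few lines of Python. This is one possible design, not the only one: it assumes separate chaining for collisions and doubles the array once the load factor passes 0.75 (both choices, and the class name <code>HashTable</code>, are assumptions made for illustration):</p>

```python
from typing import Any, Hashable, List, Tuple


class HashTable:
    """A small hash table using separate chaining and resize-on-load (illustrative)."""

    def __init__(self, capacity: int = 8) -> None:
        self._buckets: List[List[Tuple[Hashable, Any]]] = [[] for _ in range(capacity)]
        self._size = 0

    def _index(self, key: Hashable) -> int:
        # The hash function maps a key to a slot in the backing array
        return hash(key) % len(self._buckets)

    def put(self, key: Hashable, value: Any) -> None:
        bucket = self._buckets[self._index(key)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._size += 1
        if self._size > 0.75 * len(self._buckets):
            self._resize(2 * len(self._buckets))

    def get(self, key: Hashable, default: Any = None) -> Any:
        for existing, value in self._buckets[self._index(key)]:
            if existing == key:
                return value
        return default

    def _resize(self, new_capacity: int) -> None:
        # Migrate every entry into a larger array, re-hashing each key
        old_buckets = self._buckets
        self._buckets = [[] for _ in range(new_capacity)]
        for bucket in old_buckets:
            for key, value in bucket:
                self._buckets[hash(key) % new_capacity].append((key, value))

    def __len__(self) -> int:
        return self._size
```

<p> The <code>_resize</code> step is the "migrate to a larger hash table" option; it costs <code>O(N)</code> when it runs, but because the capacity doubles each time, insertion stays close to <code>O(1)</code> on average.</p>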
Revisiting Golang
https://yuzhouwan.com/posts/191026/
2019-10-25T16:06:52.000Z
2021-07-18T05:36:44.716Z
<h2 id="什么是-Golang?"><a href="#什么是-Golang?" class="headerlink" title="什么是 Golang?"></a>What Is Golang?</h2><blockquote>
<p><strong><a href="https://golang.org/">Go</a></strong>™ is an open source programming language that makes it easy to build simple, reliable, and efficient software.</p>
</blockquote>
<h2 id="特性"><a href="#特性" class="headerlink" title="特性"></a>Features</h2><h3 id="类别"><a href="#类别" class="headerlink" title="类别"></a>Category</h3><ul>
<li>Statically typed language</li>
<li>Compiled language</li>
</ul>
<h3 id="优点"><a href="#优点" class="headerlink" title="优点"></a>Advantages</h3><ul>
<li>Concurrency is supported at the language level</li>
<li>No external dependencies; compiles directly to machine code</li>
<li>Built-in <code>runtime</code> with GC support</li>
<li>Cross-platform compilation</li>
<li>Can embed C code</li>
<li>Rich standard library</li>
<li>Gentle learning curve</li>
</ul>
<h3 id="缺点"><a href="#缺点" class="headerlink" title="缺点"></a>Disadvantages</h3><ul>
<li>Interfaces are enumeration types</li>
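<li><code>import</code> does not support package versions</li>
<li>Once a <code>goroutine</code> has started, its scheduling is outside the program's control</li>
</ul>
<h2 id="环境配置"><a href="#环境配置" class="headerlink" title="环境配置"></a>Environment Setup</h2><h3 id="安装"><a href="#安装" class="headerlink" title="安装"></a>Installation</h3><p> Download the package that matches your operating system from the <a href="https://golang.org/dl/">Download</a> page and install it</p>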
<h4 id="MacOS"><a href="#MacOS" class="headerlink" title="MacOS"></a>MacOS</h4><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># After installation, the go command is available in iTerm</span></span><br><span class="line">$ <span class="built_in">which</span> go</span><br><span class="line"> /usr/local/go/bin/go</span><br></pre></td></tr></tbody></table></figure>
<h4 id="Linux"><a href="#Linux" class="headerlink" title="Linux"></a>Linux</h4><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ wget https://dl.google.com/go/go1.14.4.linux-amd64.tar.gz</span><br><span class="line">$ sudo tar -C /usr/local/ -xzvf go1.14.4.linux-amd64.tar.gz</span><br></pre></td></tr></tbody></table></figure>
<h3 id="配置"><a href="#配置" class="headerlink" title="配置"></a>Configuration</h3><figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Environment variables</span></span><br><span class="line">$ vim ~/.bashrc</span><br><span class="line"> <span class="built_in">export</span> GOROOT=/usr/local/go</span><br><span class="line"> <span class="built_in">export</span> PATH=<span class="variable">$PATH</span>:<span class="variable">$GOROOT</span>/bin</span><br><span class="line"></span><br><span class="line"><span class="comment"># Workspace</span></span><br><span class="line"><span class="comment"># bin: executables</span></span><br><span class="line"><span class="comment"># pkg: compiled library files</span></span><br><span class="line"><span class="comment"># src: Go source files</span></span><br><span class="line">$ <span class="built_in">mkdir</span> -p ~/code/gopath</span><br><span class="line">$ vim ~/.bashrc</span><br><span class="line"> <span class="built_in">export</span> GOROOT=/usr/local/go</span><br><span class="line"> <span class="built_in">export</span> GOPATH=~/code/gopath</span><br><span class="line"> <span class="built_in">export</span> PATH=<span class="variable">$PATH</span>:<span class="variable">$GOROOT</span>/bin:<span class="variable">$GOPATH</span>/bin</span><br><span class="line"></span><br><span class="line">$ <span class="built_in">source</span> ~/.bashrc</span><br></pre></td></tr></tbody></table></figure>
Session
https://yuzhouwan.com/posts/48905/
2014-11-15T07:55:36.000Z
2021-07-18T05:36:44.710Z
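<h2 id="Session-是什么?"><a href="#Session-是什么?" class="headerlink" title="Session 是什么?"></a>What Is a Session?</h2><p> A session is one conversation between a server and a browser; the conversation may be continuous or intermittent. In web development, the term also refers to a family of solutions for keeping state between the client and the server</p>
<h2 id="多样的存在形式"><a href="#多样的存在形式" class="headerlink" title="多样的存在形式"></a>Sessions Come in Many Forms</h2><ul>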
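<p> One common way to keep that state is a server-side store keyed by a random session ID that the browser sends back with every request (typically in a cookie). The sketch below is a minimal in-memory version under that assumption; the class name <code>SessionStore</code> and its methods are hypothetical, and expiry and persistence are omitted:</p>

```python
import secrets
from typing import Dict, Optional


class SessionStore:
    """In-memory session store keyed by an unguessable session ID (illustrative)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, dict] = {}

    def create(self) -> str:
        """Start a session and return the ID the client must send back."""
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = {}
        return session_id

    def get(self, session_id: str) -> Optional[dict]:
        """Return the state for a session, or None if the ID is unknown."""
        return self._sessions.get(session_id)

    def destroy(self, session_id: str) -> None:
        """End a session, for example on logout."""
        self._sessions.pop(session_id, None)


store = SessionStore()
sid = store.create()
store.get(sid)["user"] = "alice"  # state survives across separate requests
```

<p> Because the state lives on the server and only the ID travels with each request, the conversation can pause and resume, which matches the "intermittent" case described above.</p>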
<li><a href="https://yuzhouwan.com/posts/190413/">Java</a></li>
</ul>
<figure class="highlight java"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">javax.servlet.http.HttpSession</span><br></pre></td></tr></tbody></table></figure>
<ul>
<li><a href="https://yuzhouwan.com/posts/43687/">Python</a></li>
</ul>
<figure class="highlight python"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">s = requests.session()</span><br></pre></td></tr></tbody></table></figure>
<ul>
<li>PHP</li>
</ul>
<figure class="highlight php"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="variable">$_SESSION</span></span><br></pre></td></tr></tbody></table></figure>
<ul>
<li>Hibernate</li>
</ul>
<figure class="highlight java"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">org.hibernate <span class="keyword">interface</span> <span class="title class_">Session</span></span><br></pre></td></tr></tbody></table></figure>
<ul>
<li>WebLogic</li>
</ul>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Weblogic Server session</span><br></pre></td></tr></tbody></table></figure>
<ul>
<li>JSP</li>
</ul>
<figure class="highlight java"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">HttpSession</span><br></pre></td></tr></tbody></table></figure>
SSO
https://yuzhouwan.com/posts/5517/
2014-11-16T09:28:19.000Z
2021-07-18T05:36:44.706Z
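<h2 id="SSO-是什么?"><a href="#SSO-是什么?" class="headerlink" title="SSO 是什么?"></a>What Is SSO?</h2><p> SSO (<strong>S</strong>ingle <strong>S</strong>ign-<strong>o</strong>n) means that, in an environment where many systems coexist, a user who has logged in once does not have to log in again to the other systems; in other words, a single login is trusted by all of them</p>
<h2 id="为什么要有-SSO?"><a href="#为什么要有-SSO?" class="headerlink" title="为什么要有 SSO?"></a>Why Do We Need SSO?</h2><p> A large website is backed by hundreds or even thousands of subsystems, and a single user action or transaction may involve dozens of them working together<br> If every subsystem asked the user to authenticate, the user would be driven mad, and so would the subsystems, each of which would have to repeat the same authentication logic</p>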
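<p> The core idea, "log in once and let every subsystem trust the result", can be illustrated with a signed token. The sketch below is a deliberately simplified assumption, not a description of any particular SSO product: a hypothetical login service signs the user name with a secret shared by all subsystems, and each subsystem verifies the signature instead of asking for credentials again. The names <code>SHARED_SECRET</code>, <code>issue_token</code>, and <code>verify_token</code> are invented for this example, and expiry and revocation are left out:</p>

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical secret known to the login service and to every subsystem
SHARED_SECRET = b"demo-shared-secret"


def _sign(user: str, secret: bytes) -> str:
    return hmac.new(secret, user.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(user: str, secret: bytes = SHARED_SECRET) -> str:
    """Called once by the login service after the user has authenticated."""
    return f"{user}.{_sign(user, secret)}"


def verify_token(token: str, secret: bytes = SHARED_SECRET) -> Optional[str]:
    """Called by each subsystem; returns the user name if the token is genuine."""
    user, sep, signature = token.rpartition(".")
    if not sep or not user:
        return None
    expected = _sign(user, secret)
    # Constant-time comparison avoids leaking how many characters matched
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return user
    return None
```

<p> A subsystem that receives the token only calls <code>verify_token</code>; it never sees the password, which is what removes the repeated authentication logic.</p>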
Node Modules
https://yuzhouwan.com/posts/23363/
2014-11-05T01:01:15.000Z
2021-07-18T05:36:44.695Z
<h2 id="Node-js-是什么?"><a href="#Node-js-是什么?" class="headerlink" title="Node.js 是什么?"></a>What Is Node.js?</h2><blockquote>
<p><a href="https://nodejs.org/en/">Node.js</a>® is a JavaScript runtime built on Chrome’s V8 JavaScript engine.</p>
</blockquote>
<h2 id="为什么要有-Node-模块?"><a href="#为什么要有-Node-模块?" class="headerlink" title="为什么要有 Node 模块?"></a>Why Do We Need Node Modules?</h2><p> Modules are Node's way of organizing and packaging code so that it is easy to reuse</p>
A Tour of the Hadoop RPC Source Code
https://yuzhouwan.com/posts/60504/
2014-11-16T10:03:04.000Z
2021-07-18T05:36:44.686Z
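Walks through the RPC source code inside Hadoop by following a single task that downloads a file from HDFS. The final chapter provides the code of a self-implemented RPC framework for interested readers.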
DIRT
https://yuzhouwan.com/posts/47609/
2014-11-03T06:15:02.000Z
2021-07-18T05:36:44.682Z
A brief overview of the concepts related to DIRT.
An Introduction to Apache Storm
https://yuzhouwan.com/posts/13977/
2015-04-22T11:09:42.000Z
2021-07-18T05:36:44.678Z
<h2 id="Storm-是什么?"><a href="#Storm-是什么?" class="headerlink" title="Storm 是什么?"></a>What Is Storm?</h2><p> <strong>Apache Storm</strong>™ is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.</p>
<h2 id="为什么要有-Storm?"><a href="#为什么要有-Storm?" class="headerlink" title="为什么要有 Storm?"></a>Why Do We Need Storm?</h2><h3 id="分布式"><a href="#分布式" class="headerlink" title="分布式"></a>Distributed</h3><p> Offers the usual benefits of a distributed system: economy, speed, reliability, easy expansion, shared data, shared devices, convenient communication, and flexibility</p>
<h3 id="可扩展性"><a href="#可扩展性" class="headerlink" title="可扩展性"></a>Scalability</h3><p> Computation runs in parallel across threads, processes, and servers</p>
<h3 id="高可靠性"><a href="#高可靠性" class="headerlink" title="高可靠性"></a>High Reliability</h3><p> Handles failures of worker processes and nodes<br> Guarantees that every message is fully processed</p>
<h3 id="编程模型简单"><a href="#编程模型简单" class="headerlink" title="编程模型简单"></a>Simple Programming Model</h3><p> Reduces the complexity of parallel realtime processing, much as Hadoop did for parallel batch processing</p>
<h3 id="高效实时"><a href="#高效实时" class="headerlink" title="高效实时"></a>Efficient and Realtime</h3><p> Uses ZeroMQ to ensure that messages are processed quickly</p>
<h3 id="支持热部署"><a href="#支持热部署" class="headerlink" title="支持热部署"></a>Hot Deployment</h3><p> Speeds up application development</p>
Integrating Apache Storm with Kafka
https://yuzhouwan.com/posts/25015/
2015-05-10T06:28:08.000Z
2021-07-18T05:36:44.677Z
<p><br></p>
<p> For background on Apache Storm and Apache Kafka, see "<a href="https://yuzhouwan.com/posts/13977/">An Introduction to Apache Storm</a>" and "<a href="https://yuzhouwan.com/posts/26002/">Apache Kafka: A Distributed Message Queue Framework</a>" respectively</p>
<h2 id="搭建-Storm-和-Kafka-的基础环境"><a href="#搭建-Storm-和-Kafka-的基础环境" class="headerlink" title="搭建 Storm 和 Kafka 的基础环境"></a>Setting Up the Base Environment for Storm and Kafka</h2><h3 id="搭建-Storm-Kafka-集群"><a href="#搭建-Storm-Kafka-集群" class="headerlink" title="搭建 Storm / Kafka 集群"></a>Setting Up a <a href="https://yuzhouwan.com/posts/39683/#Storm">Storm</a> / <a href="https://yuzhouwan.com/posts/39683#Kafka">Kafka</a> Cluster</h3><p> For the detailed installation steps, see my other post, "<a href="https://yuzhouwan.com/posts/39683/#Storm">Apache Eagle</a>"</p>
<h3 id="启动-Kafka"><a href="#启动-Kafka" class="headerlink" title="启动 Kafka"></a>Starting Kafka</h3><ul>
<li>Start the zookeeper and kafka server</li>
</ul>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ bin/zookeeper-server-start.sh config/zookeeper.properties</span><br><span class="line">$ bin/kafka-server-start.sh config/server.properties</span><br></pre></td></tr></tbody></table></figure>
<ul>
<li>Create a topic</li>
</ul>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic my-replicated-topic</span><br></pre></td></tr></tbody></table></figure>
<ul>
<li>List topics</li>
</ul>
<figure class="highlight bash"><table><tbody><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ bin/kafka-topics.sh --list --zookeeper localhost:2181</span><br></pre></td></tr></tbody></table></figure>
Apache IoTDB: A Database for the Internet of Things
https://yuzhouwan.com/posts/201221/
2020-12-21T13:35:06.000Z
2021-07-18T05:36:44.671Z
<h2 id="Apache-IoTDB-是什么?"><a href="#Apache-IoTDB-是什么?" class="headerlink" title="Apache IoTDB 是什么?"></a>What Is Apache IoTDB?</h2><blockquote>
<p><strong><a href="https://github.com/apache/iotdb">Apache IoTDB</a></strong>™ (Database for Internet of Things) is an IoT native database with high performance for data management and analysis, deployable on the edge and the cloud.</p>
</blockquote>
<p><img data-src="/picture/iotdb/apache_iotdb_logo.png" alt=""></p>
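<center>(Image source: the official <a href="https://iotdb.apache.org/" target="_blank">Apache IoTDB</a>™ website)</center>
<h2 id="特性"><a href="#特性" class="headerlink" title="特性"></a>Features</h2><ul>
<li>High-throughput reads and writes</li>
<li>Efficient directory structure</li>
<li>Rich query semantics</li>
<li>Low hardware cost</li>
<li>Flexible deployment</li>
<li>Tight integration with the open source ecosystem</li>
</ul>
<h2 id="应用场景"><a href="#应用场景" class="headerlink" title="应用场景"></a>Use Cases</h2><ul>
<li>High-end manufacturing</li>
<li>On-premises controller servers</li>
<li>Cloud data management</li>
</ul>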
Apache Drill: A Distributed Query Engine
https://yuzhouwan.com/posts/201025/
2020-10-24T16:36:00.000Z
2021-07-18T05:36:44.658Z
<h2 id="Aapche-Drill-是什么?"><a href="#Aapche-Drill-是什么?" class="headerlink" title="Aapche Drill 是什么?"></a>What Is Apache Drill?</h2><blockquote>
<p><strong><a href="https://github.com/apache/drill">Apache Drill</a></strong>™ is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by <a href="http://research.google.com/pubs/pub36632.html">Google’s Dremel</a>.</p>
</blockquote>
<p><img data-src="/picture/drill/drill.jpg" alt="Drill"></p>
<center>(Image source: <a href="https://www.pexels.com/zh-cn/photo/87236/" target="_blank">Pexels</a>™, confirmed free of copyright restrictions)</center>
<h2 id="优缺点"><a href="#优缺点" class="headerlink" title="优缺点"></a>Pros and Cons</h2><h3 id="优势"><a href="#优势" class="headerlink" title="优势"></a>Strengths</h3><ul>
<li>Supports user-defined nested data structures</li>
<li>Compatible with Hive (including Hive UDFs, and it supports custom UDFs)</li>
<li>High-performance, low-latency SQL queries</li>
<li>Supports multiple data sources (plugin-based, including <a href="https://yuzhouwan.com/posts/26002/">Apache Kafka</a>, <a href="https://yuzhouwan.com/posts/45888/">Apache HBase</a>, Apache Hive, OpenTSDB, S3, <a href="https://drill.apache.org/docs/connect-a-data-source-introduction/">and others</a>)</li>
</ul>
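<div class="note info">UDF (User Defined Function): an ordinary user-defined function that operates on a single row</div>
<div class="note info">UDAF (User Defined Aggregation Function): a user-defined aggregate function that operates on multiple rows and returns one result</div>
<div class="note info">UDTF (User Defined Table Generating Function): a user-defined table-generating function that can take one input row and emit multiple output rows</div>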
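<p> The three kinds of function differ only in how many rows go in and how many come out. The plain Python sketch below illustrates that shape; it does not use Drill's or Hive's actual UDF APIs, and the function names are invented for this example:</p>

```python
from typing import Iterable, List


def udf_upper(value: str) -> str:
    """UDF shape: one row in, one value out."""
    return value.upper()


def udaf_total(values: Iterable[float]) -> float:
    """UDAF shape: many rows in, one aggregated value out."""
    total = 0.0
    for value in values:
        total += value
    return total


def udtf_split(value: str, separator: str = ",") -> List[str]:
    """UDTF shape: one row in, several rows out."""
    return [part.strip() for part in value.split(separator)]


rows = ["kafka, hbase", "hive"]
upper_rows = [udf_upper(r) for r in rows]               # same number of rows
exploded = [p for r in rows for p in udtf_split(r)]     # more rows than input
total = udaf_total([1.5, 2.5, 4.0])                     # a single value
```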
<h3 id="劣势"><a href="#劣势" class="headerlink" title="劣势"></a>Weaknesses</h3><ul>
<li>Differs slightly from standard SQL</li>
<li>Many external dependencies (distribution is built on <a href="https://yuzhouwan.com/posts/31915/">Apache ZooKeeper</a>, and SQL parsing is built on <a href="https://yuzhouwan.com/posts/201018/">Apache Calcite</a>)</li>
<li>Relatively niche, with little documentation available</li>
</ul>