Algorithms for Big Data Analysis (Spring 2023)

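The course assessment consists of regular homework and programming assignments, a large midterm project, and a large final project; please consider carefully before enrolling.
Classroom: 理教 402 (Science Teaching Building, Room 402)
Undergraduates from other departments who were unable to enroll: please send your student ID by both email and WeChat.
Recorded lecture videos from the Spring 2020 offering
- 华文慕课 (Chinese MOOC) platform, click this link
Course codes: 00136720 (undergraduate), 00100863 (joint undergraduate/graduate)
Course content: numerical linear algebra and optimization algorithms for data analysis
Instructor: Zaiwen Wen (文再文), wenzw at pku dot edu dot cn
Teaching assistants: Yongli Peng (彭永力), Zhifa Ke (柯志发)
Course WeChat group (the QR code below expires on March 10)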

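Class time:
- Every Monday, periods 3-4 (10:10 am - 12:00 pm); Thursdays in even-numbered weeks, periods 1-2 (8:00 am - 9:50 am)
- Instructor office hours: by individual appointment or after each class
- TA office hours: to be determined based on feedback
- Peking University academic calendar
Prerequisites:
- Year: third- and fourth-year undergraduates, graduate students
- Helpful but not required: numerical linear algebra, optimization (convex optimization), probability theory
- Ability to program in MATLAB or Python
Other information:
- Algorithms for Big Data Analysis (Spring 2022)
- Convex Optimization (Fall 2022)
Reference materials:
Course information (tentative schedule)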
- Acknowledgement: quite a few materials are taken from slides or lecture notes online.
  Please email me if the usage of any part is not appropriate and they will be deleted immediately.
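- Week 3, March 6, course introduction: lecture notes
- Week 4, March 13, introduction to linear programming, second-order cone programming, and semidefinite programming: lecture notes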
  read: “Convex optimization”, Stephen Boyd and Lieven Vandenberghe, chapters 2 and 3
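  Questions to consider: which problems and applications can be formulated as linear programs, second-order cone programs, or semidefinite programs? Modeling languages: CVX, YALMIP
  Typical solvers for LP, SOCP, and SDP: SDPT3, MOSEK, CPLEX, GUROBI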
  Prof. Boyd lecture notes on Disciplined convex programming and CVX
  “Convex optimization”, Stephen Boyd and Lieven Vandenberghe, chapters 4 and 5
- Week 4, March 16, duality theory for convex optimization: lecture notes
- Week 5, March 20, simplex and interior-point methods for linear programming: lecture notes
  “Numerical Optimization”, Jorge Nocedal and Stephen Wright, Springer, chapters 13 and 14
- Week 6, March 27, basic theory of compressive sensing and sparse optimization: lecture notes
  references: An Introduction to Compressive Sensing
  E. Candes and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51:4203–4215, 2005
  Compressive Sensing Resources
- Week 6, March 30, algorithms for sparse optimization: lecture notes
  “Proximal Algorithms”, N. Parikh and S. Boyd, Foundations and Trends in Optimization, 1(3):123-231, 2014.
  Amir Beck, Marc Teboulle, A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
  “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers”, S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Foundations and Trends in Machine Learning, 3(1):1–122, 2011
  Junfeng Yang, Yin Zhang, Alternating direction algorithms for l1-problems in Compressed Sensing, SIAM Journal on Scientific Computing
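- Week 6, April 1, algorithms for recommender systems and low-rank matrix recovery
- Week 7, April 3, algorithms for recommender systems and low-rank matrix recovery: lecture notes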
  SVD: read section 2.5 in “Matrix Computations”, Gene H. Golub and Charles F. Van Loan, The Johns Hopkins University Press
  chapter 9 in “Mining Massive Data Sets”, Stanford University
  “Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization”, Benjamin Recht, Maryam Fazel, Pablo A. Parrilo
- Week 7, April 8, optimal transport: lecture notes
  slides and computational resources on optimal transport
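- Week 8, April 10, optimal transport
- Week 8, April 13, selected topics in integer linear programming: lecture notes
- Week 9, April 17, selected topics in integer linear programming
  Selected topics in network flow problems: lecture notes
  Graphs and network flows: shortest path problems: lecture notes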
  “Network Flows: Theory, Algorithms, and Applications”, Ahuja, Magnanti, and Orlin
  Graphs and network flows: maximum flow problems, combinatorial optimization and linear programming: lecture notes
  “Network Flows: Theory, Algorithms, and Applications”, Ahuja, Magnanti, and Orlin
  Link analysis: PageRank: lecture notes
  chapter 5, link analysis, Mining of Massive Datasets
- Week 10, April 24, submodular optimization and data mining: lecture notes
  Learning with Submodular Functions: A Convex Optimization Perspective, Francis Bach
  Learning with Submodular Functions, Francis Bach
  EE596B - Submodular Functions, Optimization, and Applications to Machine Learning, Prof. Jeff A. Bilmes
- Week 10, April 27, stochastic optimization algorithms: lecture notes
  Joel A Tropp, An introduction to matrix concentration inequalities
  Optimization Methods for Large-Scale Machine Learning, Léon Bottou, Frank E. Curtis, Jorge Nocedal
  Introductory Lectures on Stochastic Optimization, by John Duchi
  Deep Learning, Ian Goodfellow and Yoshua Bengio and Aaron Courville
  Optimization for deep learning: theory and algorithms, Ruoyu Sun
  New insights and perspectives on the natural gradient method, James Martens
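- Week 11, May 1, no class (holiday)
- Week 12, May 8, stochastic optimization algorithms, randomized eigenvalue decomposition algorithms
- Week 12, May 11, randomized eigenvalue decomposition algorithms: lecture notes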
  Petros Drineas, Ravi Kannan, and Michael W. Mahoney, Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication, SIAM J. Comput., 36(1), 132–157
  Petros Drineas, Ravi Kannan, and Michael W. Mahoney, Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix, SIAM J. Comput., 36(1), 158–183
  N. Halko, P. G. Martinsson, and J. A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Rev., 53(2), 217–288.
- Week 13, May 15, phase retrieval: lecture notes
  E. J. Candes, Y. Eldar, T. Strohmer and V. Voroninski. Phase retrieval via matrix completion. SIAM J. on Imaging Sciences 6(1), 199–225.
  E. J. Candès, X. Li and M. Soltanolkotabi. Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Transactions on Information Theory 61(4), 1985–2007.
  J. Sun, Q. Qu and J. Wright, A Geometric Analysis of Phase Retrieval, Foundations of Computational Mathematics, Vol. 18, pages 1131–1198
  A. Singer, Y. Shkolnisky, Three-Dimensional Structure Determination from Common Lines in Cryo-EM by Eigenvectors and Semidefinite Programming, SIAM Journal on Imaging Sciences, 4 (2), pp. 543-572 (2011).
  L. Wang, A. Singer, Z. Wen, “Orientation Determination from Cryo-EM images using Least Unsquared Deviations”, SIAM Journal on Imaging Sciences, 6(4), pp. 2450–2483 (2013).
- Week 14, May 22, dimensionality reduction for high-dimensional data: lecture notes
  A Global Geometric Framework for Nonlinear Dimensionality Reduction, Joshua B. Tenenbaum, Vin de Silva, John C. Langford
  Unsupervised Learning of Image Manifolds by Semidefinite Programming, Kilian Q. Weinberger and Lawrence K. Saul
  A Duality View of Spectral Methods for Dimensionality Reduction, Lin Xiao, Jun Sun, Stephen Boyd
  Dimensionality Reduction: A Comparative Review, Laurens van der Maaten, Eric Postma, Jaap van den Herik
  support vector machine lecture notes
  LIBLINEAR: A Library for Large Linear Classification
  LIBSVM: A library for support vector machines
  Working set selection using the second order information for training SVM R.-E. Fan, P.-H. Chen, and C.-J. Lin.
  An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Nello Cristianini and John Shawe-Taylor, Cambridge, UK: Cambridge University Press
- Week 14, May 25, Markov decision processes: lecture notes
  Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction
  Prof. Dimitri Bertsekas's lecture slides
  Dimitri P. Bertsekas, Abstract Dynamic Programming
- Week 15, May 29, TD learning and Q-learning: lecture notes
- Week 16, June 5, policy gradient methods: lecture notes
- Week 16, June 8, policy gradient methods

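Homework and course projects: important dates and what to submit
- Unless a homework file or source is given explicitly, the exercises below are from the textbook “最优化：建模、算法与理论” (Optimization: Modeling, Algorithms and Theory) by Haoyang Liu, Jiang Hu, Yongfeng Li, and Zaiwen Wen
- Before class on March 20, Homework 1: exercises 4.2, 4.4, 4.6, 4.7
- Before class on March 30, Homework 2: exercises 5.1, 5.2, 5.5, 5.11
- Before class on April 8, Homework 3: exercises 8.2, 8.6, 8.13, 8.15
- By midnight on April 17, Course Project 1
  demo: sparsel1example.m (not directly related to Course Project 1)
  Pack the assignment files into an archive named “proj1-name.zip” and email it to the TAs (pkuopt@163.com)
- Optional: by midnight on April 24, project-ot.pdf
- Before class on April 27, integer programming model of the China State Railway Group (国铁集团) train scheduling problem, solved by calling a solver
  Project Questions 1 on page 9
- By midnight on May 11, exercises 8.20, 8.21, and the programming assignment homework-sto.pdf
- By midnight on May 15, homework-svd.pdf
- By midnight on May 25, homework-phase.pdf
- By midnight on June 22, the Project 2 written report (including LaTeX source files, code, etc.), packed into an archive named “proj2-name1-name2.zip” and emailed to the TAs (pkuopt@163.com)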
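Course projects
- Course Project 2: see the Course Project 2 details below
Grading
- Late submissions lose 10% per day (24 hours) late; homework and projects submitted 4 or more days late are not accepted, and no excuse will be accepted at any time
- Homework (large assignments, including exercises and programming): 40%
- Course Project 1: 30%
- Course Project 2: 30%
- Homework requirements:
  i) For computational problems, show the necessary derivation steps; for proofs, give the key reasoning and arguments. Numerical experiments must be submitted with both a written report and code; the report should contain detailed derivations, numerical results, and analysis.
  ii) You may discuss with classmates or ask the TAs for help, but copying directly during a discussion is not allowed; write up your own solution independently afterwards.
  iii) Copying directly from other students, the internet, previous years' solutions, other courses, or any other source is strictly prohibited.
  iv) If you discussed the work with others or received help from any other source, list the sources.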

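Final course project

Groups: form your own groups of at most 2 people; working alone is allowed
Topic: solving the integer program for China State Railway Group (国铁集团) train scheduling
Detailed project description file; its contents will be updated continually
Grading criteria
- Complete the required tasks.
- Original algorithms and analysis are encouraged.
- Written report: each group submits one report (including LaTeX source files).
- By midnight on June 22 (late reports are not accepted), pack the written report (including LaTeX source files, code, etc.) and email it to the TAs (pkuopt@163.com)
  Pack all submitted files into one archive named “proj2-name1-name2.zip”.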

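Optional extra-credit projects

Development of mathematical optimization software
- Topics include typical algorithms for convex optimization, manifold optimization, nonlinear programming, and so on
- Reference for a typical task: Software implementation for the proximal gradient methods
- Programming language: C++
- For details, contact Haoyang Liu (刘浩洋): liuhaoyang at pku dot edu dot cn
- Research assistant positions are available; compensation to be discussed in person
Draft of the textbook “Algorithms for Big Data Analysis”
- Expand the course slides into a more detailed written version, adding concrete problem descriptions, introductions to typical algorithms, typical theoretical results, detailed case studies, and numerical results.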

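Challenge projects (continually updated)

Implement the simplex method for linear programming so that it successfully solves every problem in the netlib test set:
LP test problems from netlib
Successful solve: determine whether a feasible solution exists and whether the problem is unbounded, and check that the violations of the optimality conditions (primal and dual infeasibility, duality gap) are below 1e-6. See Sec. 3.3 (pages 11-12) of the SDPT3 user guide.
Story: You have to figure out who your customer is going to be – An interview with Bob Bixby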
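Implement an interior-point method for linear programming so that it successfully solves every problem in the netlib test set (same criteria as the previous item).
Design a simplex or interior-point method for linear programming that fully exploits GPU parallelism, and test whether performance improves significantly. You may pick one large-scale linear program as the test case.
Choose one problem from compressive sensing, low-rank matrix recovery, robust principal component analysis, phase retrieval, or community detection, and work through the proof that the solution of the convex model coincides with that of the original problem
Choose one problem from low-rank matrix recovery, robust principal component analysis, phase retrieval, or community detection, consider its nonconvex model, and work through the papers on the properties of its local minimizers
Prove convergence properties of the Adam algorithm
Prove convergence properties of policy gradient methods
Choose a linear algebra problem and design a randomized algorithm for it
Choose a machine learning model or algorithm, formulate a semidefinite programming model for it, and analyze its theoretical properties
Use deep learning / reinforcement learning to solve very large-scale continuous optimization problems
Use deep learning / reinforcement learning to solve discrete optimization problems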
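
The success criterion for the LP challenge items above (primal infeasibility, dual infeasibility, and duality gap all below 1e-6) could be checked along the following lines. This is only a minimal sketch: it assumes the standard form min c'x s.t. Ax = b, x >= 0 with dual variables (y, z), and it uses relative residuals normalized in the way interior-point solvers commonly report them. The function names `lp_residuals` and `is_solved` are hypothetical, and the exact definitions to match should be taken from Sec. 3.3 of the SDPT3 user guide.

```python
import numpy as np

def lp_residuals(A, b, c, x, y, z):
    """Relative optimality measures for  min c@x  s.t.  A@x = b, x >= 0,
    with dual  max b@y  s.t.  A.T@y + z = c, z >= 0.
    The normalizations below are an assumption, not copied from any solver."""
    pinfeas = np.linalg.norm(A @ x - b) / (1.0 + np.linalg.norm(b))
    dinfeas = np.linalg.norm(A.T @ y + z - c) / (1.0 + np.linalg.norm(c))
    pobj, dobj = float(c @ x), float(b @ y)
    relgap = abs(pobj - dobj) / (1.0 + abs(pobj) + abs(dobj))
    # Largest violation of the sign constraints x >= 0 and z >= 0.
    bound_viol = max(0.0, float(-x.min()), float(-z.min()))
    return pinfeas, dinfeas, relgap, bound_viol

def is_solved(A, b, c, x, y, z, tol=1e-6):
    """True if every residual is at most tol (1e-6 as in the challenge text)."""
    return bool(max(lp_residuals(A, b, c, x, y, z)) <= tol)

# Tiny example: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x_opt = np.array([1.0, 0.0])   # primal optimal vertex
y_opt = np.array([1.0])        # dual optimal multiplier
z_opt = c - A.T @ y_opt        # dual slack, here [0, 1]

print(is_solved(A, b, c, x_opt, y_opt, z_opt))
print(is_solved(A, b, c, np.array([0.5, 0.5]), y_opt, z_opt))
```

In the example, the first call uses an optimal primal-dual pair and the second a feasible but non-optimal primal point, so only the duality gap differs between them. Detecting infeasible or unbounded problems needs a separate certificate check and is not covered by this sketch.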