Suggested Readings:

Randy:
1) https://openreview.net/pdf?id=B1J_rgWRW
   A nice paper with some theorems, but it might be too hard.
2) https://arxiv.org/abs/1608.03287
   A bit easier, I think, and also mathematical.
3) https://arxiv.org/abs/1611.03530
   The paper I discussed in my talk. It is pretty easy and proves a single theorem that might be of interest. I think it would be a nice paper for a student to look at, since it provides a bridge from the presentation I gave.

Sarkis:
1) https://www.cambridge.org/core/journals/acta-numerica/article/approximation-theory-of-the-mlp-model-in-neural-networks/18072C558C8410C4F92A82BCC8FC8CF9
   If you cannot access Acta Numerica, you can also find the paper easily on Google Scholar.

Vladimir:
1) https://arxiv.org/abs/1706.03301
2) https://arxiv.org/abs/1606.09375

Elisa:
1) George Cybenko. "Approximation by superpositions of a sigmoidal function." Mathematics of Control, Signals and Systems, 2(4):303-314, 1989. (Elisa's talk is based on this paper; its main theorem is sketched after this list.)
2) Eric B. Baum and David Haussler. "What size net gives valid generalization?" Advances in Neural Information Processing Systems, 1989.
3) John Makhoul, Richard Schwartz, and Amro El-Jaroudi. "Classification capabilities of two-layer neural nets." International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, 1989.
4) Andrew R. Barron. "Approximation and estimation bounds for artificial neural networks." Machine Learning, 14(1), 1994.
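
Since Elisa's talk is based on Cybenko's paper, here is its main result written out in LaTeX for quick reference. This is a paraphrase of the theorem as stated in the 1989 paper, with the paper's notation (I_n for the unit cube, sigma for a sigmoidal function); it is meant as an orientation aid, not a substitute for reading the paper.

\documentclass{article}
\usepackage{amsmath,amssymb,amsthm}
\newtheorem{theorem}{Theorem}
\begin{document}

% Cybenko (1989), main theorem (paraphrased): finite sums of sigmoidal
% units are dense in the continuous functions on the unit cube.
\begin{theorem}[Cybenko, 1989]
Let $\sigma:\mathbb{R}\to\mathbb{R}$ be a continuous sigmoidal function,
i.e.\ $\sigma(t)\to 1$ as $t\to+\infty$ and $\sigma(t)\to 0$ as $t\to-\infty$.
Then finite sums of the form
\[
  G(x) \;=\; \sum_{j=1}^{N} \alpha_j\, \sigma\!\left(y_j^{\mathsf{T}} x + \theta_j\right),
  \qquad \alpha_j,\theta_j\in\mathbb{R},\; y_j\in\mathbb{R}^n,
\]
are dense in $C(I_n)$, where $I_n=[0,1]^n$: for every $f\in C(I_n)$ and every
$\varepsilon>0$ there is a sum $G$ of the above form with
$|G(x)-f(x)|<\varepsilon$ for all $x\in I_n$.
\end{theorem}

\end{document}

Note that the theorem asserts density only; it says nothing about how large $N$ must be for a given $\varepsilon$, which is the question the Barron paper in Elisa's list takes up.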