A glance at the calendar reminds me that this week marks 20 years since I took the Advanced Placement Calculus AB exam. Calculus was one of those courses that seemed so foreboding and intimidating from a distance, but once I engaged with it, it came to me quite quickly. I think it helped that I had a great book: Calculus, 3rd edition by Larson, Hostetler, and Edwards. It was a big book, with its leafy cover and a white spine with "Calculus" written in large lettering. I liked the introductory chapter, which showed what you could do with calculus that was difficult to do with arithmetic. The explanations of differential and integral calculus were clear, the applications realistic, and the illustrations descriptive without being overpowered by too much color or text. This book is now in its 7th edition, and the description on Amazon says that it is "the only calculus text ever to increase its sales and market share in each edition through seven editions". I wanted to use that book as a reference during college, but I had borrowed my brother's book for Calculus AB/BC (he had used the same book for his AP Calculus course four years prior!) and my brother wanted it back. That book spoiled me, and I became very picky about the type of Calculus book I would use and keep. My search for the right Calculus book began.

When I was in college (undergrad) I used Gilbert Strang's Calculus book, and I hated it. Gilbert Strang is a very well-known mathematician and teacher at MIT, best known for his work on linear algebra. I haven't read his linear algebra books, but other people whom I trust tell me that they're really good. I did not like his calculus book at all: it was too conversational and shallow to be of any use for class, much less as a reference. I sold it as soon as I was done with the Calculus sequence. Most contemporary Calculus textbooks are awful and I have not been impressed with any of them. Important concepts are too often watered down, elaborate color pictures and tie-ins to CD-ROMs or websites are added "just because" while the actual content goes missing, and the beauty of the Calculus is lost. Several times I've walked into a university bookstore with the intention of purchasing a Calculus book and walked out disgusted by what they had on offer.

As a result, I've turned to the older books that presented Calculus in a different way before the modern educational trends started taking over. The challenge of writing a Calculus book is presenting concepts so that they can be used in applications, yet provide a foundation for deeper study in pure mathematics. I have one volume of a two-volume set in Calculus written by Tom Apostol, former math professor at Caltech. Apostol's book is different from so many other calculus books in that he covers integral calculus first before differential calculus, which takes some getting used to, but is an approach that he claims is historically and pedagogically sound. Richard Courant, who founded the Institute of Mathematical Sciences at NYU, wrote a three-volume set on Calculus that handles differential and integral calculus in the same order as Apostol. It's a challenging book, but if you master the concepts and the exercises you will be very well prepared for deeper mathematics. Michael Spivak's Calculus book, written in the late 60s/early 70s, is very much sought after. Spivak has a more conversational style than Courant and Apostol, but he is also very thorough (extremely thorough in fact; his books on Differential Geometry span five volumes!). None of these books are cheap; Apostol's two-volume set costs more than $200, Courant's three-volume set over $150, while Spivak's book costs $75 (the solutions manual, which I recommend, sells for $50). The book by Larson et al. also sells in the triple digits. Any of these books will be a valuable and frequently consulted reference for the rest of your career, but it behooves you to think of how you will use the book before you purchase it.

If anyone else has a Calculus book that they particularly enjoyed and still keep as a reference, please mention it in the comments. I'll add other books as readers recommend them.

**POSTSCRIPT:** I forgot to mention whether I found the Calculus book I wanted. I decided to order Spivak's Calculus book a few days ago. I like his writing and I like the fact that he cares deeply about teaching mathematics in a rigorous and fundamental way. It was less expensive than the other options, so that helped my decision as well.

**Books**

There are many books on Bayesian statistical analysis, but fewer that serve as a good and comprehensive introduction to Bayesian statistics. Here's a list of some books that appear to fit that description:

- Bayesian Statistics: An Introduction, written by Peter Lee of the University of York (Lee's website has much more information on the book, including contents, problem sets, and computer codes).
- Introduction to Bayesian Statistics by William Bolstad, a senior lecturer at the University of Waikato in New Zealand. Bolstad has written a short paper on the challenges of exposing students to Bayesian statistics as opposed to classical (frequentist) statistics.
- A First Course in Bayesian Statistical Methods by Peter Hoff at the University of Washington. That book is actually available online from the Springer website, but you can only view the entire book if your institution subscribes to SpringerLink.
- Bayesian Data Analysis by Andrew Gelman, John Carlin, Hal Stern, and Donald Rubin, which appears to be a practical book that offers to give researchers the tools to apply Bayesian techniques to their data analysis.
- Bayesian Statistical Modelling by Peter Congdon. Another introductory book on the field.

I've read positive comments about the books and would love to own at least two, but I can only buy one at this time. I am leaning toward Lee's book because of the additional content of the problem sets and R computer codes. Hoff's book appears to be a good one as well and not extremely difficult to read.

**Online course notes**

There is a wealth of online notes and problem sets from university courses on Bayesian statistics. Most of the courses are not taught in the Statistics departments; some are taught by professors in the Public Health, Psychology, or Political Science departments, to give some examples. I think that's a good thing in that it demonstrates the power and applicability of Bayesian statistical analysis to problems in various fields.

- Applied Bayesian Statistics, from the Political Science department at University of Chicago
- Introduction to Bayesian Statistics, from the Stats department at University of Texas – Austin
- Bayesian Statistics for Engineers, from the ISyE department at Georgia Tech. Links to other Bayesian resources here.
- Statistics for Applications from MIT OpenCourseWare. The course is taught differently by various professors; some emphasize Bayesian statistics very heavily, and others cover it very minimally.
- A course on Bayesian Methods taught at Johns Hopkins University. Course notes are a little dry but very good.
- Peter Hoff's Bayesian Statistics course at University of Washington, whose material is drawn from Hoff's book. The problem sets, computer code, and datasets would be useful for anyone undertaking a self-study.
- Notes from a Bayesian Statistics short course that was taught at Université Paris Dauphine. They are very well-made but very technical and not for everyone.

**Survey publications**

There are tons of research papers on Bayesian statistical methods, but I want to highlight the tutorial papers that are available freely.

- An introductory paper on Bayesian Analysis from a course at the University of Arizona.
- A brief two-page paper in Nature Biotechnology that poses the question "What is Bayesian statistics?" and answers with a description of its main techniques, its difficulties, and its applications (focused on biotech).
- "Bayesian Statistics for Dummies", a good primer in HTML form.
- Another introduction from Mike Goddard at the University of Melbourne, with another installment of the Bayesian vs Frequentist debate.
- A concise introduction to Bayesian statistics from KP Murphy at University of British Columbia.
- Bayesian Statistics from José Bernardo at University of Valencia.
- A more advanced paper from Cooper and Herskovits on developing Bayesian networks from data.
- This guidance document from the US Food and Drug Administration focuses on best practices of Bayesian statistics for medical trials, but the information presented here can be useful for developing similar practices for sport statistics.

Of course, this is just a sample (pun not intended) of the many books and online material you can find on the subject — plenty of material that you can use for a self-study or a group study of Bayesian statistics. I hope to find more applications for it in the near future.
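To give a flavor of what these references build toward, here is a minimal sketch of the simplest Bayesian calculation I know: updating a Beta prior with binomial data. None of this code comes from the books or notes above. The function names are my own, and the penalty-kick numbers are made up purely for illustration.

```python
def beta_binomial_update(prior_alpha, prior_beta, successes, failures):
    """Return the Beta posterior parameters after observing binomial data.

    With a Beta(alpha, beta) prior on a success probability and a binomial
    likelihood, the posterior is Beta(alpha + successes, beta + failures).
    """
    if min(prior_alpha, prior_beta) <= 0:
        raise ValueError("Beta parameters must be positive")
    if min(successes, failures) < 0:
        raise ValueError("counts must be non-negative")
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: a penalty taker converts 8 of 10 attempts.
prior = (1.0, 1.0)  # Beta(1, 1) is a uniform prior on the conversion rate
posterior = beta_binomial_update(*prior, successes=8, failures=2)

print("posterior parameters:", posterior)
print("posterior mean conversion rate:", beta_mean(*posterior))
```

The posterior mean sits between the prior mean (0.5) and the raw conversion rate (0.8), which is the kind of shrinkage toward the prior that the introductory texts spend their first chapters explaining.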

*[Parts of this post were re-written to better organize the lists, and to add links to books and course notes. This post may be updated in the future to account for any reference material that I might find.]*

The first is a software tool called CVXMOD. It is a Python-based tool for setting up and solving convex optimization problems, and was developed by yet another student in Professor Stephen Boyd's research group at Stanford. The software is based on another tool called CVX, which does the same thing and has been adopted widely throughout academia and industry. The difference is that CVX is tied to Matlab, a commercial product, while CVXMOD is built on Python and is open-source.

The second package is a tool called Sage. Its mission is to create "an open-source alternative to Magma, Maple, Mathematica, and Matlab", and it appears to do a really impressive job of tying together many open-source mathematical tools into a coherent interface. The result is a package that can handle problems from elementary to advanced mathematics, whether applied or pure. It was created by a math professor at Harvard who is my age, which is disturbing and depressing (at least to me…haha), and it's the kind of mathematical software that I have longed to see ever since I started using Linux.

I have no idea right now if there is any applicability to what I want to do vis-à-vis soccer analytics, but I think mathematical software is really cool, so I wanted to share another one of my geeky passions. I'm sure I'll figure out some application later on.

Several months ago, I wrote a few posts on various mathematical tools that might be of use to those doing more rigorous methods of soccer analytics. I came across a really nice tutorial paper on convex optimization written by Haitham Hindi, a scientist at PARC (Palo Alto Research Center, formerly owned by Xerox). It's like taking a course by Stephen Boyd (based heavily on the convex optimization book by Boyd and one of his former graduate students), but a lot less intimidating. It also helps that Hindi was one of Boyd's PhD students.

If you look around Hindi's professional website, you'll find another tutorial paper that completes the introduction to convex optimization.

I'm still not sure if there is any application to problems that I am investigating, but it's another tool to add to the toolbox. And besides, I think convex optimization is really cool.

So here are a few documents that give a more detailed explanation of linear regression than I've seen so far. It's by no means an exhaustive summary; I'm sure you can find better ones on the web.

An Introduction to Regression Analysis, by Alan Sykes at the University of Chicago School of Law. Some really good explanation of the regressors and their meaning, which is useful when it comes to explaining the results. Also some explanation of goodness-of-fit and hypothesis testing.

Some fantastic notes on simple and multiple linear regression in an Applied Statistics course at MIT Sloan School of Management. As a matter of fact, the entire course is great.

Another excellent MIT Sloan course, this time on Data Mining. There are a couple of lecture notes on the use of multiple regression in data mining — a little advanced, but a solid description of how you might want to use a (relatively) simple technique to answer some sophisticated questions.

From my alma mater, course notes on multiple regression in an Applied Statistics course. The notes looked similar to the multiple regression notes in the MIT Data Mining course. And some more notes on simple regression and some diagnostic techniques.

Fantastic course notes on applied linear regression by Jamie DeCoster at the University of Alabama. It starts simply with some statistical review, and progresses into advanced diagnostic techniques such as outlier and multicollinearity analysis.

Like I said, there are tons of lecture notes out there, so you can find one that is best suited for you, but this list is enough to get you started. Now, go fill that toolbox!
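If you want something concrete to hold onto while reading those notes, here is a minimal sketch of simple (one-regressor) least-squares regression in plain Python. It is not taken from any of the courses above. The function name and the sample numbers are my own inventions, and for real work you would reach for a proper statistics package.

```python
def fit_simple_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squares of x and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Goodness of fit: share of the variance in y explained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0

    return slope, intercept, r_squared


# Made-up data, roughly following y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared = fit_simple_regression(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.4f}")
```

The slope is the regressor's coefficient (the change in y per unit change in x), and R² is the goodness-of-fit measure that Sykes and the MIT notes discuss in much more depth.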

To this end, I am posting a link to a Convex Optimization course taught by Professor Stephen Boyd at Stanford University. Stanford has archived videos of the lectures from the 2007-08 Winter Quarter on their website, and you can also retrieve them at iTunes. Stephen Boyd is a legend in optimization research, and I regret not being able to have him for a class while I was a grad student at Stanford (I took his Linear Dynamical Systems course, but he was on sabbatical at the time so a postdoc taught it). The material in his course has broad applications to many fields, and perhaps to sports analytics as well.

I should mention MIT's OpenCourseWare as well. They have course syllabi, notes, homework, and in some cases, video lectures for a majority of the courses offered there. It is truly a treasure trove of knowledge. With respect to material that might be related to sports analytics, I would recommend the material in Mathematics and Business (the lecture notes on linear regression are awesome), but really, it's all good.
