Since you like mathematics, I highly recommend Murphy’s two Probabilistic Machine Learning books: An Introduction and Advanced Topics. Pre-print versions are freely available from his website. The books cover a wide variety of ML topics and should provide a great foundation. I also love how thorough his references are, and I feel like that alone is enough to justify the price of the books. My only caution is that they are poorly edited in places: some formulas are incorrect (you’ll probably catch the errors, and they aren’t significant), and a paragraph is missing in a section or two. But they are amazingly thorough and will definitely set a solid foundation, since they don’t shy away from explaining the underlying details like other books do.
If you’re looking for a practical book to go with it, Géron’s Hands-On Machine Learning is pretty decent, as it walks you through the general workflow of an ML project. That said, the scikit-learn documentation is awesome and, along with a few tutorials, can get you going pretty quickly.
An Introduction to Statistical Learning is a pretty decent primer if you’re looking for a good middle ground between theory and practice. The examples are in R, which may be a negative for some. If you’d like a more math-focused book with a similar feel, some of the same authors wrote The Elements of Statistical Learning, which follows the same structure but goes deeper and includes more advanced topics.
Also, depending on how familiar you are with optimization, it doesn’t hurt to do a little reading on that topic by itself. Murphy’s book provides a decent crash course, but there are plenty of other great books on the subject. I’ve found the work of Boyd to be great in that area, but I can’t remember a primer of his to recommend.
Finally, one area I think is worth dedicating its own book to is kernel methods, which are used in algorithms like support vector machines and kernel PCA. Schölkopf and Smola’s Learning with Kernels is pretty great at introducing the topic and explaining how broadly applicable kernels are.
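To give a flavor of why kernels are so broadly applicable, here is a tiny sketch of the "kernel trick" in plain Python. It is my own illustration, not taken from any of the books above, and the function names are made up for this example. It checks that a degree-2 polynomial kernel on 2-D inputs gives the same number as an ordinary dot product in an explicit 3-D feature space, without ever having to build that space when you use the kernel.

```python
import math


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot ** 2


def explicit_features(x):
    """Explicit feature map for 2-D inputs that matches poly2_kernel.

    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)
    """
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def feature_space_dot(x, y):
    """Dot product computed the long way, in the explicit feature space."""
    fx, fy = explicit_features(x), explicit_features(y)
    return sum(a * b for a, b in zip(fx, fy))


if __name__ == "__main__":
    x, y = (1.0, 2.0), (3.0, -1.0)
    # The two values agree up to floating-point rounding.
    print(poly2_kernel(x, y), feature_space_dot(x, y))
```

Algorithms like SVMs and kernel PCA only ever need these pairwise kernel values, which is why you can swap in kernels whose feature spaces are huge (or infinite-dimensional, as with the RBF kernel) at no extra cost.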
Hope that helps!