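Summary: Today I looked into cmder's color schemes and found a nice shell version, thumbs up. Also, there is no getting around English: sooner or later you find that the material you want exists only in English. :)
1) Over the past few days I found time to read several chapters of "Python for Data Analysis" (《利用python进行数据分析》). I read the pandas part twice and got familiar with some of its commands and usage.
2) A friend abroad suggested using JS for a certain problem. I started from the notes on the web page below. However, I then picked up an introductory JS book and found that its content is not where my interest lies, so for now I am just recording it here.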
http://kb.cnblogs.com/page/191787/
3) While searching for material, I also found a foreign site that introduces several good books on data analysis:
Must have books for data scientists (or aspiring ones)
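4) The link above gives English introductions to 6 books. R seems to account for more of them, and one of them is written for R but has Python code provided by a third party (note that some of the text lost its hyperlinks when pasted):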
1. R Cookbook by Paul Teetor
This is simply the best book to start your journey with R. It contains tons of examples and practical advice on a wide range of topics, from file input / output, data manipulation, merging and sorting to building a regression model. For a starter in R, this book becomes your best pal during the initial testing time.
While the book is aimed towards starters, it still remains a prominent feature of the library of any data scientist.
2. Machine Learning for Hackers by Drew Conway & John Myles White
I think this book actually has the wrong title. I passed on purchasing it twice before giving it a shot (which happened only because of a recommendation from a close friend). This book is meant for data scientists and not hackers. I don’t know why the title says so. A very practical manual for learning machine learning, it comes with good visuals and you can get a copy of the code in Python (the original book is based on R).
3. R Graphics Cookbook by Winston Chang
You can’t be a good data scientist unless you master the graphics in R! There is no better way to do visualization than to learn ggplot2. Sadly, learning ggplot2 might seem like learning a completely new language in itself. This is where this “cookbook” comes to the rescue. The recipes from Winston are short, sweet and to the point. Buy this and it is bound to end up as one of the most referred-to books in your library.
4. Programming Collective Intelligence by Toby Segaran (popularly referred to as PCI)
If there is one book you want to choose out of this selection (for learning machine learning), it is this one. I haven’t met a data scientist yet who has read this book and does not recommend keeping it on your bookshelf. A lot of them have re-read this book multiple times. The book was written long before data science and machine learning acquired the cult status they have today, but the topics and chapters are entirely relevant even today! Some of the topics covered in the book are collaborative filtering techniques, search engine features, Bayesian filtering and support vector machines. If you don’t have a copy of this book, order it as soon as you finish reading this article! The book uses Python to deliver machine learning in a fascinating manner.
5. Python for Data Analysis by Wes McKinney
Written by Wes McKinney, this book teaches you everything you need to know about pandas. For the starters (not sure why you are still reading this article), pandas is Python’s way of handling data structures. Except for the title of the book (which I find misleading), I like everything else about this book. It contains ample code and examples to leave you capable of performing any operation / transformation on a dataframe in Python (using pandas).
For the advanced users, if you already know pandas, you should look at this presentation from Wes on the shortcomings of pandas.
6. Agile Data Science by Russell Jurney
A recent addition by O’Reilly, this book looks like a must read for data scientists. The focus is on using “light” tools, which are easy to use and still get the work done. This is currently on my reading list and I’ll update more details once I have read it.
These are the 6 must-have books if you are serious about being a data scientist. There are a couple of additional Python books which you can consider: Natural Language Processing with Python by Steven Bird et al. and Mining the Social Web by Matthew A. Russell. The reason I have not kept them in the list is that you can find a lot of the information in these books easily on the web.
5) In addition, there are two good English tutorials on data analysis with Python and pandas:
A Complete Tutorial to Learn Data Science with Python from Scratch
And this one:
Data Munging in Python (using Pandas) – Baby steps in Python