Most upvoted comment
Top rated programming books on Reddit rank no. 29
The Problem with Hacker Schools(r/programming)
It seems that people have a misconception about what a 4-year CS program in a decent university can provide. I actually learned a lot in my CS program, and what I learned is still tremendously valuable to my daily job. Just to name a few:
Data structures and algorithms
I shit you not when I say this course carries me a long way. Yes, I don’t have to implement a Skip List or a OBDD as I did in school any more, but I still need to implement augmented data structures on graphs and trees, and analyze their complexity in theory and in test. And yes, I learned augmentation from my algorithm course. What’s more, I still need to read papers or books on algorithms that I’m not familiar with. The training I received in school enables me to read such literature. And did I mention dynamic programming shows up again and again in our production code?
I can’t imagine someone would say this course does not prepare you for future work. This course is a life changer! We implemented an in-memory, single-threaded database that supports SQL semantics in Lisp and Prolog. Each project was jus a few thousands lines of code, but they really drove home the fundamentals of functional programming and logic programming. The result? I could appreciate functional programming since day 1 in my job. Heck, I actually implemented a combinatoric parser for an internal DSL when I started as a programmer. Oh, and all kinds of scoping rules, pass-by-this-and-that semantics? Piece of cake after this course.
We wrote tons of code. A lexer, a parser, a code generator. A static analyzer. We learned a lot too about parsing theory, static analysis, and “exotic” things that European dudes love to talk about (Check out this book if you will: www.amazon.com/Principles-Program-Analysis-Flemmin…. We used it for a reference). Are they any useful besides allowing us to practice coding and organizing complicated software? Hell yeah! Trust me, sooner or later you’ll need to write a parser or understand code generation. More importantly, the course allows me to peek inside how compiler/interpreter/virtual machine works.
We learned building a RISC CPU with preliminary pipelines. The knowledge about pipeline and micro parallelism is still relevant when it comes to optimizing our code.
Implementing all kinds of search and planning algorithms using Prolog. True, I don’t have to code any AI project in my job, but really, the deep understanding of searching and the algorithmic techniques used in planning are still very useful for projects that involves resource allocation and optimization.
We implemented scheduling algorithms, memory management algorithms, and file system code. Actually, the professor was too heavy on theories so we didn’t get to hack a kernel. Even so, what I learned was still valuable. To make a large system run faster, we have to get to the OS level from time to time. Let alone all the concepts I learned and am still using, such as cache coherence, locality, virtual memory, and etc.
It’s just a weird name for B-Tree, Hashing, and Inverted Index. We learned a different computation model that involves disks (or slower IO in general), and we implemented a fully functional B-Tree, inverted index, and hashing in order to implement a simplified database from scratch. That’s thousands and thousands lines of code. Lots of fun, lots of lessons to be learned. Is it helpful? You bet! After this course (it’s in my second year, BTW), it just became natural to analyze large systems that involve IO. It also helps me understand things like log structure file systems, or search engine internals .
The project was nothing. Just a library management system using embedded SQL. Yes, you can say that is outdated. However, what really matters is we learn a ton of relational algebra, relational calculus, and transactional processing (yes, many many hours of proofs). Case in point: I was interviewing a few years ago with a startup. By that time, I hadn’t done SQL programming for a long time, and they asked me problems that involve multiple joins and aggregations and subqueries. And no matter, they are just combinations of fundamental operations, and I nailed the interview easily. And when I had to wrote my own version of MVCC, I knew what do do.
Should I say any more? Building distributed systems is hot in the valley after all. When I interviewed for a startup after wasting my time as a J2EE programmer for a few years, I had no problem answering their interview questions, or implementing a distributed system later in the job. This course grilled me with so many concepts that I have to deal with on a daily basis. I just can’t appreciate it enough.
I didn’t think I would need this course later, but hey, life moves in circles. The tools I learned in the course, Matlab and Maple in particular, gave me an edge. No, I’m not using these two tools any more, but I had no problem picking up Octave, R, and IPython. They made my job a lot easier when I was experimenting some algorithms to forecast on time series.
By the way, that nifty interval arithmetic we had to struggle to implement in our first numeric analysis assignment? Hey, Google uses it solve the clock skew problem in its Spanner system. Who would know, huh?
All the math and statistics
Again, in the world of building large, distributed, and optimized systems. The more of math and stats you learn, the better. Besides, it prepares me to read academic papers, which proves invaluable for us to find solutions to our problems. Given that machine learning is becoming red hot in the valley, it surely doesn’t hurt to know more about math or stats, does it? Interestingly, even calculus is helpful. I’m not saying manually calculating integrals or solving differential equations — software can surely do better than me. I’m talking about modeling our problems with the help of Calculus. An example is the Gossip Protocol. Its analysis is based on math model for epidemiology, which we happened to learn in our mathematical modeling course. In fact, we actually modeled one of our graph sampling problem as if it is continuous, which greatly simplifies our analysis.
Math is important to distributed systems programming too. It turned out streaming algorithms deals with a lot of math, at least for now. And to make systems distributed easier, we also tap into abstract algebra. Check out Twitter’s algebird, for example.
Still useful. State machines, anyone? It’s the cornerstone of distributed systems. Automaton, anyone? Regex comes right out of it. Computability and complexity. Well, I can’t say they directly helped me in my job. They did help me in my interview, though. For some reason, some startups liked to ask about NP questions. They usually pretended that there was a non-brute-force solution to a problem and asked me to solve it. I did help a co-worker once with the knowledge of polynomial reduction, too. He described his problem, and I showed him that the problem can be reduced to the partition problem, so he didn’t waste time finding a polynomial solution.
The list can still go on. The bottom line is, my university did not necessarily teaches me trendy technologies, but they did prepare me for my future job. Technologies come and go all the time, but the fundamentals that my professors drilled in me guide me for a long time.
Permalink : /r/programming/comments/267s49/the_problem_with_hacker_schools/