Machine Learning: Hands-On for Developers and Technical Professionals

Jason Bell

Language: English

Pages: 408

ISBN: 1118889061

Format: PDF / Kindle (mobi) / ePub


Dig deep into the data with a hands-on guide to machine learning

Machine Learning: Hands-On for Developers and Technical Professionals provides hands-on instruction and fully coded working examples for the most common machine learning techniques used by developers and technical professionals. The book contains a breakdown of each ML variant, explaining how it works and how it is used within certain industries, allowing readers to incorporate the presented techniques into their own work as they follow along. A core tenet of machine learning is a strong focus on data preparation, and a full exploration of the various types of learning algorithms illustrates how the proper tools can help any developer extract information and insights from existing data. The book also includes a full complement of instructor's materials to facilitate use in the classroom, making it useful both for students and as a professional reference.

At its core, machine learning is a mathematical, algorithm-based technology that forms the basis of historical data mining and modern big data science. Scientific analysis of big data requires a working knowledge of machine learning, which forms predictions based on known properties learned from training data. Machine Learning is an accessible, comprehensive guide for the non-mathematician, providing clear guidance that allows readers to:

  • Learn the tools of machine learning, including Hadoop, Mahout, and Weka
  • Understand decision trees, Bayesian networks, and artificial neural networks
  • Implement association rule, real-time, and batch learning
  • Develop a strategic plan for safe, effective, and efficient machine learning

By learning to construct a system that can learn from data, readers can increase their value across industries. Machine learning sits at the core of deep-dive data analysis and visualization, which is increasingly in demand as companies discover the goldmine hiding in their existing data. For the tech professional involved in data science, Machine Learning: Hands-On for Developers and Technical Professionals provides the skills and techniques required to dig deeper.

The command-line prompt is a simple greater-than sign (>). You can perform calculations on the command line, so adding numbers together is a trivial process, like so:

> 1+2
[1] 3
>

To get proper use from R, though, you need to think a little more programmatically.

Variables and Vectors

R supports variables as you would expect. To assign them, you can use either the equal sign (=) or the less-than sign and a hyphen together (<-):

> myage = 21
> myageagain <- 21
> myage
[1] 21
> myageagain
[1] 21

Copy the data you want to process to the HDFS filesystem:

hadoop fs -put mydata.txt mydata.txt

Run the Hadoop job. It takes two parameters: the input file or directory in HDFS and the directory for the output results (which must not already exist on HDFS):

hadoop jar /path/to/hadoop-*-examples.jar wordcount mydata.txt output

Hadoop then processes the job. To see the results, you need to copy the result data from the HDFS filesystem back to your local filesystem:

hadoop fs -getmerge
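
The excerpt above runs the prebuilt wordcount example that ships with Hadoop. For readers who want a feel for what such a job looks like on the inside, here is a minimal sketch of a word-count mapper, reducer, and driver against the org.apache.hadoop.mapreduce API; the class names and driver settings are illustrative assumptions, not the shipped example's exact source.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Illustrative word-count job, not the bundled hadoop-*-examples.jar source.
public class WordCountSketch {

    // Emits (word, 1) for every token in each input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Sums the emitted counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count sketch");
        job.setJarByClass(WordCountSketch.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input file in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output dir, must not exist yet
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Packaged into a jar, this would be launched with hadoop jar in the same way as the excerpt's command.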

lines of text; they are needed to write the file (:w) or quit the application (:q). If you want to quit without saving, then :q! is needed. Likewise, overwriting an existing file uses :w!. There is an “insert mode” while you are entering new text, and there is an “edit mode” for moving and changing text. Vi is so simple that there is a coffee mug available with all (100%) of the vi commands on it. It takes time to learn the commands: how to delete lines, how to yank text into “buffers” and then paste it, how to

tweet) {
    int score = 0;
    StringTokenizer st = new StringTokenizer(cleanTweet(tweet));
    String thisToken;
    while (st.hasMoreTokens()) {
        thisToken = st.nextToken();
        if (poswords.contains(thisToken)) {
            score = score + 1;
        } else if (negwords.contains(thisToken)) {
            score = score - 1;
        }
    }
    return score;
  }
}

When the class is first requested within a stream definition, Spring XD loads the class and loads the positive and negative word lexicons, storing each in its respective HashSet; this is handled by
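
Because the scoring method above is cut off mid-signature, here is a minimal, self-contained sketch of the same word-counting sentiment idea. The class name, the tiny hard-coded lexicons, and the cleanTweet implementation are illustrative assumptions; the book's Spring XD processor loads its positive and negative word lists from lexicon files instead.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

// Illustrative sketch: score a tweet by counting positive and negative words.
public class SimpleSentimentScorer {

    // Tiny hard-coded lexicons; a real processor would load these from word lists.
    private final Set<String> poswords = new HashSet<>(Arrays.asList("good", "great", "happy"));
    private final Set<String> negwords = new HashSet<>(Arrays.asList("bad", "awful", "sad"));

    // Lower-case the tweet and replace anything that is not a letter with a space.
    private String cleanTweet(String tweet) {
        return tweet.toLowerCase().replaceAll("[^a-z]", " ");
    }

    // +1 for every positive word, -1 for every negative word.
    public int score(String tweet) {
        int score = 0;
        StringTokenizer st = new StringTokenizer(cleanTweet(tweet));
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (poswords.contains(token)) {
                score = score + 1;
            } else if (negwords.contains(token)) {
                score = score - 1;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        SimpleSentimentScorer scorer = new SimpleSentimentScorer();
        System.out.println(scorer.score("What a great, happy day"));     // prints 2
        System.out.println(scorer.score("That film was awful and sad")); // prints -2
    }
}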

coming in for processing.

Creating Your First Stream with Scala

StreamingContext replaces the standard SparkContext as the main entry point. The code listing is pretty simple; it listens to the raw socket on port 9898 on localhost and then does a quick word count on the data coming in.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.storage.StorageLevel

object
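
The Scala listing in the excerpt stops at the object declaration. As a rough companion, and not the book's listing, here is a sketch of the same socket word count written against Spark Streaming's Java API (assuming Spark 2.x); the class name, the local[2] master, and the 5-second batch interval are illustrative choices.

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.StorageLevels;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import scala.Tuple2;

// Illustrative Java counterpart to the excerpt's Scala streaming word count.
public class JavaSocketWordCount {
    public static void main(String[] args) throws Exception {
        // local[2]: one thread for the receiver, one for processing.
        SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("JavaSocketWordCount");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(5));

        // Listen to the raw socket on localhost:9898, as in the excerpt.
        JavaReceiverInputDStream<String> lines =
                ssc.socketTextStream("localhost", 9898, StorageLevels.MEMORY_AND_DISK_SER);

        // Split each line into words, then count occurrences per batch.
        JavaDStream<String> words = lines.flatMap(line -> Arrays.asList(line.split(" ")).iterator());
        JavaPairDStream<String, Integer> counts =
                words.mapToPair(word -> new Tuple2<>(word, 1)).reduceByKey((a, b) -> a + b);

        counts.print();
        ssc.start();
        ssc.awaitTermination();
    }
}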
