Tag Archives: hadoop

Hadoop example for Exim logs with Python.

The following is an example of parsing an exim_mainlog using Hadoop streaming. I’ve implemented both the mapper and the reducer in Python. Neither handles all of Exim’s log formats yet, but both can easily be extended if you actually end up using the output (this is [...]
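The excerpt does not include the post's own mapper and reducer, so the following is only a minimal sketch of what a Hadoop streaming pair for an Exim main log might look like. It assumes an illustrative task (counting received messages per sender) and assumes that arrival lines have the layout "date time message-id <= sender ..."; the function names `map_line`, `mapper`, `reducer`, and `run` are hypothetical and not taken from the original code.

```python
import sys
from itertools import groupby


def map_line(line):
    """Return (sender, 1) for an Exim message-arrival line, else None."""
    fields = line.split()
    # Assumed layout: date, time, message id, "<=", sender address, ...
    if len(fields) >= 5 and fields[3] == "<=":
        return fields[4], 1
    return None


def mapper(lines):
    """Yield tab-separated key/value records, the format Hadoop streaming expects."""
    for line in lines:
        pair = map_line(line)
        if pair is not None:
            yield "%s\t%d" % pair


def reducer(lines):
    """Sum the counts for each key; the input must already be sorted by key."""
    records = (line.rstrip("\n").split("\t", 1) for line in lines)
    for key, group in groupby(records, key=lambda record: record[0]):
        yield "%s\t%d" % (key, sum(int(count) for _, count in group))


def run(step, stream_in=sys.stdin, stream_out=sys.stdout):
    """Apply a mapper or reducer step to a stream, one record per output line."""
    for record in step(stream_in):
        stream_out.write(record + "\n")
```

In a real streaming job the mapper script would call `run(mapper)` and the reducer script `run(reducer)`, with Hadoop sorting the mapper output by key in between. Other Exim line types (deliveries, completions, rejections) are simply skipped here and would need their own branches in `map_line`.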
Posted in Programming

Annotated List of Hadoop Tutorials

Official Tutorials: The official Hadoop tutorial by Apache uses Hadoop classes with Java mappers and reducers to calculate word counts from several example books. It is very thorough and informative, and a good way for first-time Hadoop users to meet all the components and ideas. The official Hadoop tutorial by Yahoo is essentially the same [...]
Posted in Programming