Search code examples
Hadoop UniqValueCount Map and Aggregate Reducer for Large Dataset (1 billion records)...


hadoop, mapreduce, hadoop-streaming, elastic-map-reduce

Hive / Map-Reduce Job on a Hadoop cluster: How to (roughly) calculate the diskspace needed?...


hadoop, mapreduce, hive, hdfs, elastic-map-reduce

Hadoop Pig save each line of a file to S3...


hadoop, amazon-s3, apache-pig, elastic-map-reduce, amazon-emr

Downloading files from FTP to local using Java makes the file unreadable - encoding issues...


java, hadoop, ftp, elastic-map-reduce, amazon-emr

Reading large files using mapreduce in hadoop...


java, hadoop, mapreduce, elastic-map-reduce, amazon-emr

How to specify mapred configurations & java options with custom jar in CLI using Amazon's EM...


java, hadoop, mapreduce, elastic-map-reduce, emr

Too many open files in EMR...


hadoop, mapreduce, elastic-map-reduce, emr

Best way to have a fast access key-value storage for huge dataset (5 GB)...


java, hadoop, mapreduce, elastic-map-reduce, emr

How do you use Python UDFs with Pig in Elastic MapReduce?...


apache-pig, elastic-map-reduce

Producing ngram frequencies for a large dataset...


postgresql, hadoop, mapreduce, bigdata, elastic-map-reduce

What ports does Apache Hadoop version 1.0.3 use for intracluster communication of the daemons...


hadoop, mapreduce, hbase, rhel, elastic-map-reduce

Loading data with Hive, S3, EMR, and Recover Partitions...


hadoop, amazon-s3, amazon-web-services, hive, elastic-map-reduce

How to decide on number of parallel mappers/reducers along with Heap memory?...


hadoop, mapreduce, elastic-map-reduce, emr

Easiest way to get started with Hadoop...


hadoop, elastic-map-reduce

Can I access zookeeper from AWS Elastic Mapreduce job...


hadoop, amazon-web-services, apache-zookeeper, elastic-map-reduce, emr

When using LZO on Hadoop output on AWS EMR, does it index the files (stored on S3) for future automa...


amazon-s3, amazon-web-services, elastic-map-reduce, lzo

Performance Impact on Elastic Map reduce for Scale Up vs Scale Out scenarios...


amazon-web-services, mapreduce, elastic-map-reduce

Problems using distcp and s3distcp with my EMR job that outputs to HDFS...


amazon-web-services, elastic-map-reduce, amazon-emr, emr

How do I pass the Hadoop Streaming -file flag to Amazon ElasticMapreduce?...


elastic-map-reduce, hadoop-streaming

Elastic MapReduce fails with: 1: Syntax error: "(" unexpected...


elastic-map-reduce

How can I share jar libraries with amazon elastic mapreduce?...


hadoop, amazon-ec2, elastic-map-reduce

Setting hadoop parameters with boto?...


python, boto, elastic-map-reduce

Can you programmatically control Elastic Mapreduce jobs easily?...


ruby, hadoop, elastic-map-reduce, amazon-emr

Join performance on AWS elastic map reduce running hive...


amazon-ec2, hive, hdfs, elastic-map-reduce

Interface as Mapper value output...


java, interface, hadoop, mapreduce, elastic-map-reduce

AWS Elastic Map Reduce: output to SimpleDB...


hadoop, amazon-simpledb, elastic-map-reduce

Amazon EMR: Configuring storage on data nodes...


hadoop, amazon-ec2, amazon-web-services, elastic-map-reduce, emr

Force one reducer in AWS EMR...


amazon-web-services, elastic-map-reduce

Hadoop seems to modify my key object during an iteration over values of a given reduce call ...


hadoop, reduce, elastic-map-reduce

boto ElasticMapReduce throttling and rate limiting...


amazon-ec2, throttling, boto, rate-limiting, elastic-map-reduce
