Final: Question 1

Please download the Enron email dataset, unzip it and then restore it using mongorestore. It should restore to a collection called “messages” in a database called “enron”. Note that this is an abbreviated version of the full corpus. There should be 120,477 documents after restore.

Inspect a few of the documents to get a basic understanding of the structure. Enron was an American corporation that engaged in widespread accounting fraud and subsequently failed.

In this dataset, each document is an email message. Like all email messages, each one has a single sender but can have multiple recipients.

Construct a query to calculate the number of messages sent by Andrew Fastow, CFO, to Jeff Skilling, the president. Andrew Fastow’s email address was Jeff Skilling’s email was

For reference, the number of email messages from Andrew Fastow to John Lavorato ( was 1.

Solution: 3

  • Download the Enron email dataset and extract it
  • Now run “mongod”
  • Now import the extracted database: mongorestore --drop --db enron dump/enron
  • Now run “mongo”
  • Check that the enron database was imported with “show databases”
  • If enron appears in the list, you are ready to go.
  • Now run “use enron”
  • Then “show collections”
  • You will see the messages collection
  • Check your data with “db.messages.findOne()”:
    {
      "_id" : ObjectId("4f16fc97d1e2d32371003f02"),
      "body" : "COURTYARD\n\nMESQUITE\n2300 HWY 67\nMESQUITE, TX 75150\ntel: 972-681-3300\nfax: 972-681-3324\n\nHotel Information: \n\n\nARRIVAL CONFIRMATION:\n Confirmation Number:84029698\nGuests in Room: 2\nNAME: MR ERIC BASS \nGuest Phone: 7138530977\nNumber of Rooms:1\nArrive: Oct 6 2001\nDepart: Oct 7 2001\nRoom Type: ROOM - QUALITY\nGuarantee Method:\n Credit card guarantee\nCANCELLATION PERMITTED-BEFORE 1800 DAY OF ARRIVAL\n\nRATE INFORMATION:\nRate(s) Quoted in: US DOLLAR\nArrival Date: Oct 6 2001\nRoom Rate: 62.10 per night. Plus tax when applicable\nRate Program: AAA AMERICAN AUTO ASSN\n\nSPECIAL REQUEST:\n NON-SMOKING ROOM, GUARANTEED\n \n\n\nPLEASE DO NOT REPLY TO THIS EMAIL \nAny Inquiries Please call 1-800-321-2211 or your local\ninternational toll free number.\n \nConfirmation Sent: Mon Jul 30 18:19:39 2001\n\nLegal Disclaimer:\nThis confirmation notice has been transmitted to you by electronic\nmail for your convenience. Marriott's record of this confirmation\nnotice is the official record of this reservation. Subsequent\nalterations to this electronic message after its transmission\nwill be disregarded.\n\nMarriott is pleased to announce that High Speed Internet Access is\nbeing rolled out in all Marriott hotel brands around the world.\nTo learn more or to find out whether your hotel has the service\navailable, please visit\n\nEarn points toward free vacations, or frequent flyer miles\nfor every stay you make! Just provide your Marriott Rewards\nmembership number at check in. Not yet a member? Join for free at\n\n\n",
      "filename" : "2.",
      "headers" : {
        "Content-Transfer-Encoding" : "7bit",
        "Content-Type" : "text/plain; charset=us-ascii",
        "Date" : ISODate("2001-07-30T22:19:40Z"),
        "From" : "",
        "Message-ID" : "<32788362.1075840323896.JavaMail.evans@thyme>",
        "Mime-Version" : "1.0",
        "Subject" : "84029698 Marriott Reservation Confirmation Number",
        "To" : [ "" ],
        "X-FileName" : "eric bass 6-25-02.PST",
        "X-Folder" : "\\ExMerge - Bass, Eric\\Personal",
        "X-From" : "",
        "X-Origin" : "BASS-E",
        "X-To" : "EBASS@ENRON.COM",
        "X-bcc" : "",
        "X-cc" : ""
      },
      "mailbox" : "bass-e",
      "subFolder" : "personal"
    }
  • Now run the following query to get the answer:
    • db.messages.find({"headers.From": "", "headers.To": ""}).count()
  • Confirm the result. I got 3 as the answer.
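
The reason a plain equality on "headers.To" works is that "To" is an array, and MongoDB matches an equality condition on an array field when any element equals the value. Below is a minimal sketch of that matching rule in plain JavaScript, runnable without a database. The sample documents, the example.com addresses and the countFromTo helper are all hypothetical and only mirror the shape of the messages collection.

```javascript
// Hypothetical sample documents shaped like the "messages" collection.
// The addresses are placeholders and are not taken from the dataset.
const messages = [
  { headers: { From: "alice@example.com", To: ["bob@example.com"] } },
  { headers: { From: "alice@example.com", To: ["carol@example.com", "bob@example.com"] } },
  { headers: { From: "alice@example.com", To: ["carol@example.com"] } },
  { headers: { From: "bob@example.com", To: ["alice@example.com"] } },
];

// Mimics the filter {"headers.From": from, "headers.To": to}:
// From must equal the sender, and the To array must contain the recipient.
function countFromTo(docs, from, to) {
  return docs.filter(
    (doc) => doc.headers.From === from && doc.headers.To.includes(to)
  ).length;
}

// Counts messages from alice to bob, whether or not bob is the only recipient.
const total = countFromTo(messages, "alice@example.com", "bob@example.com");
console.log(total); // prints 2
```

The second sample message has two recipients and is still counted, which is the same behaviour the find(...).count() query relies on in the shell.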
