9 thoughts on “About”

  1. dear sir,
    i wonder if you repeated the analysis on an english translation of the first 5 books of the christian bible (the Torah isn’t translated, is it?). the nice thing there is that there are different translations. king james, new american standard, whatever… presumably, the translations should give the same classifications. i wonder if that would give your analysis more legitimacy. i guess the translations are like adding noise to the underlying text. if the results are the same with the addition of noise, then i would think that provides some measure of confidence?
    best regards.

    • Hi there,

      I wish it were so easy that I could just analyze the English translations, but if I were to do that then the results would probably say a lot more about the person doing the translating than the original text.

  2. Your analysis on the authorship of the Hebrew Bible is interesting. I can help you with your questions, but that would be better suited to an email exchange than to a brief comment.

  3. Hi, I’m just curious about the size of the datasets you typically work with? I found this site when searching for help on using the ff package. I regularly use SAS to analyze credit risk data on datasets consisting of millions of records and thousands of variables. The data I analyze are also of mixed modes, with a roughly 50/50 split between character and numeric.

    Is this kind of data similar to what you analyze? I have been struggling to find a consistent workflow for analyzing large datasets (not “big” data) using open-source tools like R. I need to read in csv files too big to fit in memory, explore the data, create new column vectors, and store it all on disk somehow. Would you recommend the ff package as a solution that can handle this?

    Cheers, and thanks for the interesting blog posts!

    • Hi there,

      I can’t say I’ve had to work with data sets containing millions of records and thousands of variables, but I can say that ff should be able to handle it. Like SAS, ff allows you to analyze data without loading all of it into memory at once. I would for sure recommend using ff for your work needs. Just remember that when working with ff, everything needs to be treated accordingly. New vectors always seem to need to be declared as ff objects (e.g. as.ff(ifelse(x[, "something"] == "blah", 1, 0))), and not every R function works the same way when your data is in ff form.
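
      As a rough illustration of that workflow, here is a minimal sketch in R. It assumes the ff package is installed; the tiny temporary CSV, the column names something and amount, and the new column is_blah are all made up for the example and stand in for a file that is too big for memory.

      ```r
      library(ff)  # assumes the ff package is installed

      # Hypothetical sample file standing in for a CSV too large for memory.
      csv_path <- tempfile(fileext = ".csv")
      write.csv(
        data.frame(something = c("blah", "other", "blah"),
                   amount    = c(10.5, 3.2, 7.1)),
        csv_path, row.names = FALSE
      )

      # Read the file in chunks into an on-disk ffdf. Character columns are
      # declared as factors because ff does not store plain character vectors.
      x <- read.csv.ffdf(
        file = csv_path, header = TRUE,
        colClasses = c("factor", "numeric"),
        first.rows = 2, next.rows = 2
      )

      # Build the new column in RAM, then wrap it with as.ff() before
      # attaching it to the ffdf.
      x$is_blah <- as.ff(ifelse(x[, "something"] == "blah", 1, 0))
      ```

      Note that x[, "something"] pulls the whole column into memory, which is fine for one column at a time but would need chunked processing if even a single column were too large.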

      Good luck!

  4. Hi, I’m a community blog curator for DZone. I wanted to talk with you about potentially featuring your blog on DZone’s content portals. If you’re interested, please send me an email at alecn@DZone.com and I’ll explain the details.

    Thank you!

  5. Hi. It’s Craig Offman, the Globe reporter you spoke to last fall about ebikes. I’d like to follow up with you. Are you free in the next day or two for a chat? Much appreciated.
