Passwords in R

The problem:

I constantly query databases with R (using RMySQL), which requires my passwords to be written explicitly in my code. Due to the sensitive nature of the data, this isn’t going to work. When working locally, this isn’t so much of a problem as long as I log out when I’m away from my computer and hope no one is looking over my shoulder when I’m not. It is still fairly insecure, and I’m not sure the people over at the IRB or HIPAA would look too fondly on that, but no one is getting fined. And I can sleep at night. Most of the time.

It seems to me that the two threats to the security of my data are:

  1. People looking over my shoulder, writing down my password, stealing my data, and then publishing numerous brilliant papers regarding early Autism detection with the data I’m supposed to safeguard.
  2. People hacking my hard drive, stealing my password, and then publishing numerous brilliant papers regarding early Autism detection with the data I’m supposed to safeguard.

Luckily for me, I am less worried about #2 (hackers) than I am about #1 (onlookers), for the sole reason that a frightening number of the researchers I have dealt with in the Autism early detection world (which is a substantial number, given that they are pretty much all in the consortium) enter data into Excel and analyze data with SPSS. Maybe they use REDCap if they are really feeling high-tech.

An aside: I met one researcher in the consortium who uses R, but I found out that rather than writing out TRUE or FALSE, he uses T or F, which, although technically possible, is disgusting, and he should be suspended from school and given time to think about his actions and how they affect others. Although I’m not sure they do that kind of thing at Yale.

The solution:

There is none. I don’t think.

The workaround:

Prying eyes

For prying eyes, I pulled a small function, which I call obfus, from the examples section of the help file for chartr. It is a poor man’s cryptography of sorts. You can run your password through it once with the argument rv=FALSE to obfuscate it and then run the result through it again with rv=TRUE to convert it back to your original password.

In use, you would find your obfuscated password in the terminal, and in the password argument of whatever function you are using, you would pass obfus('your_obfuscated_password'). Take THAT, people behind me.
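
A minimal sketch of what such a function could look like. This is a reconstruction under assumptions, not necessarily the exact code: it assumes a fixed rotation of 13 positions over letters, digits, and the underscore, in the spirit of the rot example in ?chartr. The argument name rv and its default of TRUE follow the usage described above; characters outside the assumed alphabet pass through unchanged.

```r
# Hypothetical reconstruction of obfus(): a simple rotation cipher built on chartr().
# rv = FALSE obfuscates, rv = TRUE (the default) reverses the obfuscation.
obfus <- function(x, rv = TRUE) {
  # Assumed alphabet: lower case, upper case, digits, underscore (63 characters)
  alphabet <- c(letters, LETTERS, 0:9, "_")
  k <- 13  # assumed rotation distance
  # Same alphabet, rotated left by k positions
  shifted <- c(alphabet[-seq_len(k)], alphabet[seq_len(k)])
  plain     <- paste(alphabet, collapse = "")
  scrambled <- paste(shifted, collapse = "")
  # chartr(old, new, x) replaces each character of `old` found in x
  # with the character at the same position in `new`
  if (rv) chartr(scrambled, plain, x) else chartr(plain, scrambled, x)
}

# Run once in the terminal and copy the result into your script:
obfuscated <- obfus("my_Passw0rd", rv = FALSE)

# In the script, only the obfuscated string appears; this recovers the original:
obfus(obfuscated)
```

Anyone who has the script and the function can reverse this in one line, so it only guards against a glance at the screen.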

Use within a function that requires a password then looks something like:
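
A minimal sketch, assuming a connection made with RMySQL through DBI and an obfus() helper like the one described above. The wrapper name connect_db and the username, database, and host values are placeholders for illustration:

```r
# Hypothetical wrapper: nothing runs until connect_db() is called, at which
# point obfus() converts the obfuscated string back into the real password.
connect_db <- function() {
  DBI::dbConnect(
    RMySQL::MySQL(),
    username = "my_user",                           # placeholder
    password = obfus("your_obfuscated_password"),   # never the plain password
    dbname   = "my_database",                       # placeholder
    host     = "localhost"                          # placeholder
  )
}

# con <- connect_db()
```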

Pros:

  • People behind you can’t see your password
  • It will work via source and R CMD BATCH.
  • That’s it.

Cons:

  • Your password is so insecure it’s ridiculous.

Hackers

Alternatively, don’t hard-code your password. Use readline so that R prompts for your password, assign the result of readline to an object, and use that object in your password argument. Clear the console (ctrl + L) after entering your password, and then remove the password object from your workspace.

cat('\14') is equivalent to ctrl+L; use that if you’re too lazy to actually press ctrl+L.
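
Those steps can be bundled into one small helper. This is only a sketch: the names with_password, action, and ask are made up for illustration, and it assumes a console where the form-feed character clears the screen (RStudio does this).

```r
# Hypothetical helper: prompt for the password, clear the console,
# use the password once, then remove the object that held it.
with_password <- function(action, ask = readline) {
  pw <- ask("Password: ")   # readline() echoes what you type
  cat("\014")               # form feed, same character as '\14'; like ctrl+L
  result <- action(pw)      # e.g. a function that opens the database connection
  rm(pw)                    # drop the password object once it has been used
  result
}

# Interactive use (placeholder connection details):
# con <- with_password(function(p) {
#   DBI::dbConnect(RMySQL::MySQL(), username = "my_user", password = p,
#                  dbname = "my_database", host = "localhost")
# })
```

Because pw only ever exists inside the function, nothing is left in the workspace afterwards. At the top level of a script, the same steps would end with rm(pw).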

Pros:

  • People behind you can’t read your password unless they look at your console sometime between you typing in your password and you clearing the console.
  • People can’t read your password if they make off with your script file.

Cons:

  • Won’t work via R CMD BATCH or any other non-interactive run, where readline returns an empty string without waiting for input.
  • It is annoying to have to type in your password regularly.

Conclusion:

I am unaware of a perfect solution to this problem. If someone knows something that I don’t, I would love to hear it. The two options provided above are far from perfect, but they are the best I can come up with. I have heard of options using Tcl/Tk, but when dialogue boxes pop up in the middle of working on something (even when I expect them), it makes me want to punch through my computer screen (not that I would—I’d prefer to not fracture my wrist again), so I avoid those workarounds at all costs.
