At the department we have been analyzing transaction data for some time, and we recently got a new dataset with lots of transactions. When you need interpurchase times (IPTs), R's POSIX date-time classes are quite useful, because you can simply difference the transaction dates to generate IPTs.
The data is on a daily basis. However, some IPTs turned out to be non-integer:
30.04167 days?? Well, of course R is right. So what's the reason? Marco (thx!) quickly gave me a hint: daylight saving time. My time zone observes daylight saving time, and by default strptime() uses the current time zone. A span that crosses the clock change is therefore one hour (1/24 = 0.04167 days) longer or shorter than a whole number of days. Switching to UTC solved the issue.
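A minimal sketch of the effect, not the original analysis: the two dates and the "Europe/Berlin" time zone are assumptions chosen for illustration, picked so that the span crosses the end of daylight saving time in autumn 2010.

```r
# Two purchase dates that straddle the end of daylight saving time
# (clocks went back one hour on 2010-10-31 in Central Europe).
dates <- c("2010-10-15", "2010-11-14")

# "Europe/Berlin" is an assumed stand-in for the local time zone.
local <- strptime(dates, format = "%Y-%m-%d", tz = "Europe/Berlin")
utc   <- strptime(dates, format = "%Y-%m-%d", tz = "UTC")

# Interpurchase time in days under each time zone
ipt_local <- as.numeric(difftime(local[2], local[1], units = "days"))
ipt_utc   <- as.numeric(difftime(utc[2], utc[1], units = "days"))

print(ipt_local)  # 30 days plus one extra hour
print(ipt_utc)    # exactly 30
```

For purely daily data, as.Date() is another way to sidestep the problem, since Date objects carry no time-of-day or time zone.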
joint posterior
MCMC & R
Wednesday, 27 April 2011
Wednesday, 13 April 2011
compiler and runiregGibbs (bayesm)
So everyone's excited about the new R 2.13 release because of the compiler package.
Apparently it is easy to get a 3x speed increase by simply compiling a function.
Since I do a lot of MCMC work, I am particularly excited about speed in R. I just compiled a 2000-line file with code from my latest project, but none of the functions ran faster. Apparently I need to break things down a little more and use more subfunctions.
So I tried a much easier example. I took runiregGibbs from the well-known bayesm package (a function written completely in R) and compiled it. There's a visible change, but it's quite small.
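A minimal sketch of this kind of comparison, using a toy loop instead of runiregGibbs so that it runs without bayesm. The function name loop_sum and the problem size are made up for illustration, and the timings depend on the machine and R version.

```r
library(compiler)

# Toy stand-in for an R-only sampler: a plain interpreted loop.
loop_sum <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + sqrt(i)
  s
}

# cmpfun() returns a byte-compiled copy of the function
loop_sum_cmp <- cmpfun(loop_sum)

n <- 1e6
print(system.time(loop_sum(n)))      # original function
print(system.time(loop_sum_cmp(n)))  # compiled copy
```

On recent R versions (3.4 and later) the JIT compiler is enabled by default, so both calls may take about the same time there.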
Friday, 21 January 2011
Disable auto-update from R (Windows)
There are two major threats to complex MCMC estimations:
- Wrong energy settings (hibernate after 2 hours of inactivity)
- Automatic Updates (install updates at 3 a.m.)
I thought about the latter threat. At times, you may hand some R code to co-workers, especially when you need to estimate lots of different models in a short period of time. If they have automatic updates enabled, your results are quickly in jeopardy. While automatically saving the workspace [if(rep==10000) save.image("save.RData")] is definitely a very good idea, deactivating automatic updates on Windows is still important for long-running calculations.
Therefore I thought about ways to deactivate automatic updates from within R. A major obstacle is UAC (User Account Control), which turns your administrator account into something ... with fewer rights. Windows frequently asks for privilege elevation. Sadly, elevation does not work without further administrative tools, which you will not find on every PC. Therefore I decided to use the "runas" command, which asks the user for her password once.
Although this is not a perfect solution yet, it is also a very nice example of some interesting functions: paste, sub, system.
[EDIT]
"runas" apparently does not work with the "net stop" command, so a .bat file needs to be created, which can then be started using runas. The .bat is created from within R.
[EDIT2]
Random John reminded me that an option to turn automatic updates back on would be nice. Well, I added that.
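A rough sketch of how the pieces could fit together, not the original script. The helper name make_update_switch, the account name "Administrator" and the file location are assumptions; "wuauserv" is the Windows Update service name. The sketch uses gsub where the post mentions sub, because every slash in the path has to be replaced. It only builds the .bat file and the runas command, and the system() call is left commented out.

```r
# Build (but do not run) what is needed to stop or start Windows Update.
# admin_user and dir are placeholders; adjust them for the target machine.
make_update_switch <- function(action = c("stop", "start"),
                               admin_user = "Administrator",
                               dir = tempdir()) {
  action <- match.arg(action)

  # Write a one-line batch file: "net stop wuauserv" or "net start wuauserv"
  bat_file <- file.path(dir, paste0("wu_", action, ".bat"))
  writeLines(paste("net", action, "wuauserv"), bat_file)

  # runas expects a Windows-style path, so turn "/" into "\"
  bat_win <- gsub("/", "\\", bat_file, fixed = TRUE)
  cmd <- paste0("runas /user:", admin_user, " \"", bat_win, "\"")

  list(bat_file = bat_file, command = cmd)
}

sw <- make_update_switch("stop")
print(sw$command)

# On Windows only, and only if you really want to run it:
# system(sw$command)
```

Calling make_update_switch("start") produces the matching command to switch the service back on.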
Saturday, 15 January 2011
Quickly adapt starting values in MCMC using paste()
Waiting for convergence of MCMC models can take some time, so it may be a good idea to use better starting values. Using paste(), one can quickly convert any (parameter) vector in the workspace into an R-style vector (with c()).
Here's a function that takes any vector (for example 1:10) and turns it into c(1,2,3,4,5,6,7,8,9,10). Now it's easy to quickly adapt starting values in your code.
[update 1]
Well, of course we can do the same with matrices.
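A minimal sketch of such a function. The name vec2code is made up for illustration and is not necessarily what the original code used.

```r
# Turn a vector into the text of an R expression that recreates it.
vec2code <- function(x) {
  paste0("c(", paste(x, collapse = ","), ")")
}

print(vec2code(1:10))  # "c(1,2,3,4,5,6,7,8,9,10)"

# Round trip: the generated text evaluates back to the same values
start_values <- c(0.5, -1.25, 3)
print(eval(parse(text = vec2code(start_values))))
```

The resulting string can be pasted straight into a script as the new starting values.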
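One way the matrix version could look, again as a sketch under assumptions: the name mat2code is hypothetical, and the matrix is written out in R's default column-major order.

```r
# Turn a matrix into the text of a matrix() call that recreates it.
mat2code <- function(m) {
  paste0("matrix(c(", paste(as.vector(m), collapse = ","), "), ",
         "nrow = ", nrow(m), ", ncol = ", ncol(m), ")")
}

m <- matrix(1:6, nrow = 2)
print(mat2code(m))  # "matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)"
```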
Tuesday, 11 January 2011
table() in R
The table function in R is very useful, especially when working with survey data. Often you may have Likert scales for levels of agreement or satisfaction. table() quickly gives the distribution of answers, which can then be used for (bar)plots.
However, especially with long scales (e.g. a 10-point scale), some answer categories may go unused. Therefore I thought it would be nice to have a table function which returns "0" for unused categories. I tried to implement this in the table2() function. It works just like table(); there is only one additional argument, "classs", which is the scale (1:10 for a 10-point Likert scale).
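A minimal sketch of how table2() could be written. This is one possible implementation and not necessarily the original one: it converts the answers to a factor whose levels are the full scale, so table() reports unused categories as 0. The sample answers are invented.

```r
# table() with a fixed set of categories; unused ones are counted as 0.
# "classs" is the full scale, e.g. 1:10 for a 10-point Likert scale.
table2 <- function(x, classs) {
  table(factor(x, levels = classs))
}

answers <- c(1, 2, 2, 5, 7, 7, 7, 10)   # made-up survey answers
counts <- table2(answers, classs = 1:10)
print(counts)

# barplot(counts) would now show all ten categories, including empty ones
```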
User Account Control (Windows)
When using Windows Vista / 7, Windows' User Account Control can be annoying. When is R going to be Windows 7 ready, asking for elevated privileges only when needed? I guess the problem lies in the structure of packages.
Running Rgui.exe "as Administrator" by right-clicking is one option to get around these problems. However, when installing Tinn-R and linking it to R, I haven't found a solution that avoids turning off UAC completely during the installation, which is again annoying, because switching UAC requires a Windows reboot.
Monday, 10 January 2011
Install R Packages wherever needed
I frequently occupy computers everywhere with extensive MCMC tasks. Installing R doesn't take long, but it can be very annoying if you manually have to install dozens of R packages before your code is able to run. Well, now I use the following command to load my packages:
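A minimal sketch of such a command, under assumptions: the helper name load_or_install and the CRAN mirror URL are illustrative choices, not taken from the original post. It tries to load each package and installs it first only if loading fails.

```r
# Load each package; install it from CRAN first if it is missing.
# The repos URL is an assumed default mirror.
load_or_install <- function(pkgs, repos = "https://cloud.r-project.org") {
  for (p in pkgs) {
    if (!require(p, character.only = TRUE, quietly = TRUE)) {
      install.packages(p, repos = repos)
      library(p, character.only = TRUE)
    }
  }
  invisible(pkgs)
}

# Packages that ship with R, so nothing is downloaded here
load_or_install(c("stats", "compiler"))

# For a real project, e.g.:
# load_or_install(c("bayesm"))
```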
This way I don't have to worry about installing packages.