STA 250 :: Advanced Statistical Computing (UCD, Fall 2013)

Code + goodies used in Prof. Baines' STA 250 Course (UC Davis, Fall 2013)

Project maintained by STA250 Hosted on GitHub Pages — Theme by mattgraham

STA 250 :: Homework 00

For all questions you must show your work. This enables us to understand your thought process, give partial credit and prevent crude cheating. Please see the code of the conduct in the Syllabus for rules about collaborating on homeworks.

For questions requiring computing, if you use R, python or any programming environment then you must turn in a printout of your output with your solutions.
In addition, a copy of your code must be uploaded to your HW0 directory as per Q6 below.

Homework 0 (No Credit -- Practice Only)

Due: In Class, 5:30pm Wed October 9th

Assigned: Wednesday Oct 2nd

Some basic coding problems to get you back in the swing.

Write a program that prints the numbers from 1 to 100. But for multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". For numbers which are multiples of both three and five print "FizzBuzz".
(From: http://www.codinghorror.com/blog/2007/02/why-cant-programmers-program.html)
Write a program that generates 10,000 uniform random numbers between 0 and \(2\pi\) (call this \(x\)), and 10,000 uniform random numbers between 0 and 1 (call this \(y\)). You will then have 10,000 pairs of random numbers. Transform \((x,y)\) to \((u,v)\) where:
\[\begin{aligned} u & = y\cdot\cos(x) \\ v & = y\cdot\sin(x) . \end{aligned}\]
Make a 2D scatterplot of the 10,000 \((u,v)\) pairs.

What is the distribution of: \(r=\sqrt{u^{2}+v^{2}}\)
Consider the following snippet:
```
Hello, my name is Bob. I am a statistician. I like statistics very much.
```
a. Write a program to spit out every character in the snippet to a separate file (i.e., file out_01.txt would contain the character H, file out_02.txt would contain e etc.). Note that the ,, . and spaces should also get their own files.

b. Write a program to combine all files back together into a single file that contains the original sentence. Take care to respect whitespace and punctuation!
Run boot_camp_demo.py as a batch job on Gauss using the submission script boot_camp_sarray.sh in the Github repo. Follow the instructions in class for how to do this.
Run the Twitter code provided in lecture. Make sure to run the tweet-grabbing portion of code for a sufficient length of time (It is recommended to open another terminal and run ls -alh to check the size of the output file). The README provides full instructions for each of the steps.

See how your plot differs from the one shown in lecture 01
Modify the code to report the percentage of tweets that had geo-tagged data at the end of the sentiment analysis.

Consider the autoregressive process of order 1, usually called an AR(1) process:
\[ y_{t} = \rho{}y_{t-1} + \epsilon_{t} , \quad \epsilon_{t}\sim{}N(0,1) , \]
for \(t=1,2,…,n\). Let \(y_{0}=0\) and the \(\epsilon_{t}\) be independent.

a. Simulate from this process with \(\rho=0.9\) and \(n=1000\). Plot the resulting series.

b. Repeat part (a) 200 times, storing the result in a \(1000\times{}200\) matrix. Each column should correspond to a realization of the random process.

c. Compute the mean of the 200 realizations at each time points \(t=1,2,\ldots,1000\).
Plot the means.

d. Plot the variance of the 200 realizations at each time points \(t=1,2,\ldots,1000\).
Plot the variances.

e. Compute the mean of each of the 200 series across time points \(i=1,2,\ldots,200\).
Plot the means.

f. Compute the variance of each of the 200 series across time points \(i=1,2,\ldots,200\).
Plot the variances.

g. Justify the results you have seen in parts b.--f. theoretically.
a. Let \(Z\sim{}N(0,1)\). Compute \(\mathbb{E}\left[\exp^{-Z^{2}}\right]\) using Monte Carlo integration.

b. Let \(Z\sim{}\textrm{Truncated-Normal}(0,1;[-2,1])\). Compute \(\mathbb{E}\left[Z\right]\) using importance sampling.
Let \(x_{ij}\sim{}N(0,1)\) for \(i=1,\ldots,n\) and \(j=1,2\), and \(x_{i0}=1\) for \(i=1,\ldots,n\). Define \(x_{ij}^{T}=(x_{i0},x_{i1},x_{i2})^{T}\) and \(\beta=(1.2,0.3,-0.9)^{T}\) and let \(\epsilon_{i}\sim{}N(0,1)\) for \(i=1,\ldots,n\).

Simulate from the linear regression model:
\[ y_{i} = x_{i}^{T}\beta + \epsilon_{i} , \quad i=1,\ldots,n , \]
with n=100. Use the bootstrap procedure to estimate \(\textrm{SD}(\hat{\beta})\) based on \(B=1000\) bootstrap resamples. Compare to the asymptotic results reported by lm or computed using the square root of the diagonal elements of \(\hat{\sigma^{2}}(X^{T}X)^{-1}\).
In this question you will fork the course GitHub repo and upload your homework code to all previous homework questions to the repo. Go to https://github.com/STA250/Stuff.

Click on the "Fork" button:
Wait for the fork to complete. When it does, you will be taken to the newly forked repo. For example:

This has forked the repo to your GitHub.com account, but the repo is not stored on your laptop/desktop at this point.
On Mac/Windows, load the GitHub GUI you should have installed, and then proceed as below:
Go to the GitHub GUI, click on your username under the "GitHub.com" tab, click on "Refresh" (on the bottom of the screen). The forked repo should appear with the option to "Clone to Computer". For example:
Click "Clone to Computer". Select where to save the repo, and wait for the clone to complete:

Congratulations, you have now successfully forked the course repo. Any time you need to update the repo (e.g., if Prof. Baines posts new code/slides/assignments) you can click on "My Repositories" and the forked "Stuff" repo. Then click on the "Sync Branch" icon in the top left of the GUI:

Note: In general, you will need to commit any changes for the sync to proceed.

For the Linux-ers on Mac command-line folks, you can do all of the above via (with obvious modifications):

git config --global user.name "John Doe"
git config --global user.email johndoe@example.com
git clone https://github.com/[yourgithubusername]/[yourforkedrepo].git

To get the current status (note: you must be in the local repo directory):

cd Stuff # Change to the newly downloaded repo
ls # Check files are there… 
git status # Should be up-to-date

To get the latest updates:

git pull

Again, any local changes must be committed prior to the git pull request.

Now, move your HW0.(R|py) into the HW0 directory on your local machine. If you now run git status (or check the status via the GUI) you should receive a message informing you that you have uncommitted files.
If using the command line, run (from the local repo directory):
```
git add HW0/HW0.py
```
or similar, depending on what your code file is called. This will add the file to the repo.
Next, commit the change to the repo (run this from the local repo directory):
```
git commit -a -m "Add a description of the changes you made."
```
Make sure to use a descriptive message for the update e.g., "Added HW0 code". This stages the commit on your local machine, but the commit will not appear on GitHub.com until you push it to the site. If using the GUI you can commit via the "Changes" tab. In the "Commit Summary" box, enter a message for the update, then click "Commit". The committed changes should appear in the "Unsynced Commits" tab.
To push the change to GitHub, from the local repo directory:
```
git push 
```
Using the GUI, push to GitHub using the "Sync Branch" button.
Voila. If you go to your GitHub account, and navigate to the HW0 folder of the forked repo, you should see your homework code! :)

STA 250 :: Advanced Statistical Computing (UCD, Fall 2013)

STA 250 :: Homework 00

Homework 0 (No Credit -- Practice Only)

Due: In Class, 5:30pm Wed October 9th

Assigned: Wednesday Oct 2nd

(: Happy Coding! :)