Exercises

Write a function called findMister() that, when given any string, will return a character vector of the words that immediately follow the string “Mister”, with exactly one space in between. The function should take a single argument called str, the string to search. A typical example of use is as follows:
```
text <- "Here are Mister Tom, MisterJerry, Mister Mister, and Mister\tJoe."
findMister(text)
```
```
## [1] "Tom"    "Mister"
```
Write a function called findMr() that, when given any string, will return a character vector of all words following the string “Mr.” with exactly one space in between. The function should take a single argument called str, the string to search. A typical example of use is as follows:
```
text <- "Here are Mr. Tom, Mr Jerry, Mr. Mister, and Mr.\tJoe."
findMr(text)
```
```
## [1] "Tom"    "Mister"
```
For each of the following descriptions, write a regular expression to test whether any of the sub-string(s) described occur in a given string. The regular expression should match any of the sub-strings described, and should not match any other sub-string. Try to make the regular expression as short as possible. Write the regular expression as a string that could be used in one of R’s regex functions (i.e., with extra backslash escapes as needed). The first item is done for you, as an example.
- bot and bat. Regex string: "b[oa]t". (This is the one to submit, because it’s shorter than other alternatives such as "bot|bat".)
- cart and cars and carp.
- slick and sick.
- Any word ending in ity (such as velocity and ferocity). Be sure to pay attention to word-boundaries. You should match "velocity" but not " velocity" (which includes a space before the “v”) or "velocity;".
- A whole number consisting of more than six digits.
- A word that is between 3 and 6 characters long. Pay attention to word-boundaries.
- One or more white-space characters, followed by a hyphen or a semicolon or a colon.
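The remark above about extra backslash escapes can be illustrated with a small sketch. The pattern below is a hypothetical example that is not one of the exercise items: the regex `\d\.` (a digit followed by a literal period). It uses base R's grepl(), which is an assumption here; any of R's regex functions should accept the same string.

```r
# Hypothetical pattern, not one of the exercise items:
# the regex \d\. matches a digit followed by a literal period.
# Inside an R string literal, each backslash must itself be escaped.
pattern <- "\\d\\."

# writeLines() shows the regex as the regex engine will see it: \d\.
writeLines(pattern)

# The R string holds four characters, not six.
nchar(pattern)

# Test which strings contain a digit immediately followed by a period.
hits <- grepl(pattern, c("Section 3. Results", "No digits here.", "3 apples"))
hits
```

Only the first string contains a digit immediately followed by a period, so hits should be TRUE, FALSE, FALSE.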
Write a function called findTitled() that, when given any string, will return a character vector of all words following any one of these titles:
- “Mr.”
- “Mister”
- “Missus”
- “Mrs.”
- “Miss”
- “Ms.”
There should be exactly one space between the title and the following word. The function should take a single argument called str, the string to search. A typical example of use is as follows:
```
text <- "Here are Mr. Tom, Ms. Thatcher, Miss Ellen, and Helen."
findTitled(text)
```
```
## [1] "Tom"      "Thatcher" "Ellen"
```
Write a function called capRepeats() that, when given a string, searches for all repeated-word pairs (with at least one character of white-space in between) and replaces them with the same pair where all letters are capitalized. The function should take a single argument called str, the string to be searched. A typical example of use would be as follows:
```
capRepeats("I have a boo boo on my knee    \tknee!")
```
```
## [1] "I have a BOO BOO on my KNEE    \tKNEE!"
```
Use str_subset() to write a function called longWord() that, when given a character vector of strings, returns a vector consisting of the strings that contain a word at least eight characters long. The function should take a single argument called strs. An example of use would be:
```
myText <- c("Very short words.", "Got a gargantuan word.", "More short words!")
longWord(strs = myText)
```
```
## [1] "Got a gargantuan word."
```
Write a function called longWord2() that, when given a character vector of strings, returns a list of character vectors, where each vector consists of the words in the corresponding string that are at least eight characters long. The function should take a single argument called strs. An example of use would be:
```
myText <- c("Very short words.", "Got a gargantuan word.", "More short words!")
longWord2(strs = myText)
```
```
## [[1]]
## character(0)
## 
## [[2]]
## [1] "gargantuan"
## 
## [[3]]
## character(0)
```
Write a function called phoneNumber() that, when given a vector of strings, returns a logical vector indicating which of the strings contain a valid phone number. For our purposes a valid phone number shall be any string of the form

xxx-xxx-xxxx

or

xxx.xxx.xxxx

Thus, 502-863-8111 is valid and so is 502.863.8111, but not 502-863.8111.

In the code for the function, specify the pattern using (?x) so you can ignore whitespace and leave detailed comments for each portion of the regular expression.

The function should take a single parameter called strs. A typical example of use would be:
```
sentences <- c("Ted's number is 606-255-3143.",
               "Rhonda's number is 403-28-1259.",
               "Lydia's number is 502.255.3921.",
               "Raj's number is 502.367-4432.")
phoneNumber(strs = sentences)
```
```
## [1]  TRUE FALSE  TRUE FALSE
```
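As a reminder of how the (?x) flag works, here is a minimal sketch on a different, hypothetical pattern (a 24-hour time such as 09:30), so that it does not give away the phone-number exercise. It assumes base R's grepl() with perl = TRUE; the variable names timePattern and found are illustrative only.

```r
# Hypothetical pattern for illustration: a 24-hour time such as 09:30.
# With (?x), whitespace in the pattern is ignored and everything from
# a '#' to the end of the line is treated as a comment.
timePattern <- "(?x)      # turn on free-spacing mode
  \\b                     # start at a word boundary
  ([01]\\d|2[0-3])        # hours, 00 through 23
  :                       # a literal colon
  [0-5]\\d                # minutes, 00 through 59
  \\b                     # end at a word boundary
"

# perl = TRUE selects the PCRE engine, which understands (?x).
found <- grepl(timePattern, c("Meet at 09:30.", "Meet at 25:99."), perl = TRUE)
found
```

The first string contains a valid time and the second does not, so found should be TRUE, FALSE. Note that in this mode a literal space or '#' in the pattern would have to be escaped or placed inside a character class.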