Get data from body of emails, then create a dataframe from it (outlook)

Hi there, each morning I get an email listing the sales of fruit and veg for the day. What I’d like to do is create a code that will search my outlook emails for a particular email subject, iterate line by line through the body of those emails for the fruit data, take the ‘Date of report’ from the body and then tie it altogether in a datatable.

The structure of the emails is as follows:

Date of report:,22-JAN-2022
Time report produced:,23-JAN-2022 00:11
Apples,5
Grapes,8
Grapefruit,6
Lemons,5
Oranges,1
Pears,2

I was able to achieve the desired result in Python (see code below), however as time has passed it’s become necessary to achieve the result in R instead. Now, I realise there are ways to interface between Python and R by using the likes of RPy, but unfortunately that isn’t going to be suitable for me.

Fruit_1 = "Apples"
Fruit_2 = "Grapefruit"
Fruit_3 = "Oranges"

fruit_search = [Fruit_1, Fruit_2, Fruit_3]

def get_vals(report, searches):
    dct = {}
    for line in report:
        term, *value = line
        if term.casefold().startswith('date'):
            dct['date'] = pd.to_datetime(value[0])
        elif term in searches:
            dct[term] = float(value[0])
    if len(dct.keys()) != len(searches):
        dct.update({x: None for x in searches if x not in dct})
    return dct

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) 
messages = inbox.Items
messages.Sort("[ReceivedTime]", False)

result = []

for message in messages:
    if message.subject == 'FRUIT SALES':
        if Fruit_1 in message.body and Fruit_2 in message.body:
            data = [line.strip().split(",") for line in message.body.split('\n')]
            result.append(get_vals(data, fruit_search))
        else:
            pass

Fruit_vals = pd.DataFrame(result)
Fruit_vals.columns = map(str.upper, Fruit_vals.columns)

Now, I’m horrendously new to R; but I’ve been using what I did in Python as a bit of a guide and I’ve got as far as the function:

Fruit_1 <- "Apples"
Fruit_2 <- "Grapefruit"
Fruit_3 <- "Oranges"

fruit_search <- c(Fruit_1, Fruit_2, Fruit_3) ## Assuming the vectors work similarly to the previous
OutApp <- COMCreate("Outlook.Application")
outlookNameSpace = OutApp$GetNameSpace("MAPI")

search <- OutApp$AdvancedSearch("Inbox", "urn:schemas:httpmail:subject = 'FRUIT SALES'")
inbox <- outlookNameSpace$Folders(6)$Folders("Inbox")

get_vals <- function(report,searches) {
  data <- read.table(text=report,sep=",")
  colnames(data) <- c('key','value')

Conceptually I get stuck around creating the function, and though I would like to use an R vector (fruit_search), what I’ve been reading seems to suggest vectors won’t work in a comparable way to the fruit_search in python.

Any advice would be very much appreciated. :slightly_smiling_face: