Why am I getting an 'IndexError: string index out of range' - Please Help!

I will provide two sets of code. The first one works fine. In the second, I concatenate the data from two CSV files into one DataFrame and then run the same code on the newly created DataFrame, and I get the error 'IndexError: string index out of range'.

Code #1

import csv

csv_file = csv.reader(open('epl1617.csv'))
next(csv_file)

upsets = 0
non_upsets = 0

starting_bankroll = 100
wagering_size = 5

bankroll = starting_bankroll

for game in csv_file:
    home_team = game[2]
    away_team = game[3]

    home_goals = int(game[4])
    away_goals = int(game[5])

    home_odds = float(game[23])
    draw_odds = float(game[24])
    away_odds = float(game[25])

    if home_odds > away_odds:
        if home_goals > away_goals:
            upsets += 1
            bankroll += wagering_size * (home_odds - 1)
        else:
            non_upsets += 1
            bankroll -= wagering_size

ROI = ((bankroll - starting_bankroll) / (wagering_size * (upsets + non_upsets))) * 100

print("There were '%s' upsets out of '%s' total matches" % (upsets, upsets + non_upsets))
print("Starting bankroll = '%s'" % (starting_bankroll))
print("Finishing bankroll = '%s' | ROI = '%s'" % (bankroll, ROI))

Code #2

import pandas as pd
import glob

path = r'Q:\Users\Panagiotis\Betting formula\EPL Seasons'  # use your path
all_files = glob.glob(path + "/*.csv")

li = []

for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)

upsets = 0
non_upsets = 0

starting_bankroll = 100
wagering_size = 5

bankroll = starting_bankroll

for game in frame:
    home_team = game[2]
    away_team = game[3]

    home_goals = int(game[4])
    away_goals = int(game[5])

    home_odds = float(game[23])
    draw_odds = float(game[24])
    away_odds = float(game[25])

    if home_odds > away_odds:
        if home_goals > away_goals:
            upsets += 1
            bankroll += wagering_size * (home_odds - 1)
        else:
            non_upsets += 1
            bankroll -= wagering_size

ROI = ((bankroll - starting_bankroll) / (wagering_size * (upsets + non_upsets))) * 100

print("There were '%s' upsets out of '%s' total matches" % (upsets, upsets + non_upsets))
print("Starting bankroll = '%s'" % (starting_bankroll))
print("Finishing bankroll = '%s' | ROI = '%s'" % (bankroll, ROI))

Any feedback is greatly appreciated.

Thank you,

Panagiotis

I'm not really sure, but I think your data has not yet been prepared for pd.concat().
In the documentation they use pd.Series() before calling pd.concat(). Could this be your problem?

Hello Biirra,

Thank you for your reply!

I don't think that's the issue. The first part of the code takes the two CSV files, turns them into DataFrames via pd.read_csv(), appends them to a list, and then concatenates the objects in that list into the newly created DataFrame. I can also see from the variable explorer that the type of the variable frame is indeed DataFrame.
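
As a rough check of that reasoning, here is a minimal sketch of the same read, append and concatenate steps. It uses two small in-memory CSVs in place of the real season files; the column names and rows are made up for illustration and are not taken from the actual data.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two season files (made-up columns and rows)
season_a = io.StringIO("Div,HomeTeam,AwayTeam,HomeGoals,AwayGoals\nE0,TeamA,TeamB,2,1\n")
season_b = io.StringIO("Div,HomeTeam,AwayTeam,HomeGoals,AwayGoals\nE0,TeamC,TeamD,0,3\n")

li = []
for source in (season_a, season_b):
    # Each read_csv call already returns a DataFrame
    df = pd.read_csv(source, index_col=None, header=0)
    li.append(df)

# pd.concat accepts a list of DataFrames directly; no Series conversion is needed
frame = pd.concat(li, axis=0, ignore_index=True)

print(type(frame))
print(frame.shape)
```

If this behaves the same way with the real files, the concatenation step is probably not where the error comes from.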

In addition, if I print(frame) I can see the concatenated data on the console…
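
One thing that may be worth checking: print(frame) shows the rows, but looping with "for game in frame" does not walk through those rows. Iterating over a DataFrame yields its column labels, so game is a string such as a column name, and indexing a short string at position 3 or 23 would raise exactly this IndexError. The sketch below uses a made-up frame (hypothetical column names and values) to show the difference, and uses itertuples() as one possible way to get row-by-row positional access like csv.reader gives.

```python
import pandas as pd

# Hypothetical two-row frame; column names and values are made up
frame = pd.DataFrame({
    "Div": ["E0", "E0"],
    "HomeTeam": ["TeamA", "TeamC"],
    "AwayTeam": ["TeamB", "TeamD"],
    "HomeGoals": [2, 0],
    "AwayGoals": [1, 3],
})

# Iterating over the DataFrame itself yields column labels (strings), not rows
labels = [game for game in frame]
print(labels)

# Indexing a short label past its end raises the error from the question
try:
    "Div"[3]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)

# itertuples(index=False) yields one tuple per row, so positional access works
home_wins = 0
for game in frame.itertuples(index=False):
    if int(game[3]) > int(game[4]):
        home_wins += 1
print(home_wins)
```

With the real data, the positions 2, 3, 4, 5, 23, 24 and 25 would still need to match the column order of the concatenated frame, which is an assumption this sketch does not verify.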