Modifying Data AttributeError: module 'string' has no attribute 'lower'

Hello, I am new to programming. I am doing the Data Science path and am currently in the Introduction to pandas module.

I downloaded Jupyter Notebook using Anaconda, as suggested in the course.

I’m trying to replicate the lesson “Performing Column Operations” on my computer (https://www.codecademy.com/paths/data-science/tracks/data-processing-pandas/modules/dspath-intro-pandas/lessons/pandas-ii/exercises/columns-apply), but when I tried to change all the strings in one column to uppercase or lowercase, I received the error: “AttributeError: module 'string' has no attribute 'lower'.”

I tried with “upper” instead of “lower” and I received the same error.

I tried from string import upper, and it returned:
ImportError: cannot import name 'upper' from 'string' (C:\Users\gland\anaconda3\lib\string.py)

Then I tried import string and didn’t receive any error, but when I ran df['estado'] = df.estado.apply[string.upper], I got the error: AttributeError: module 'string' has no attribute 'upper'.

I have watched many videos on YouTube and googled the error, but I don’t know how to resolve it.

Can someone help me?

.upper() and .lower() are methods, not attributes.

Do you know the difference? I haven’t taken the pandas course. Have classes already been covered?
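A short illustration may help with the question above. It is only a sketch: the variable name is made up, and it shows plain Python behaviour rather than anything specific to the lesson. A method is a function attached to a type and is called with parentheses, while other attributes, such as the constants in the string module, are plain data.

```python
import string

name = "gustavo"  # hypothetical value, for illustration only

# upper is a method of str: call it on a string value with parentheses.
print(name.upper())

# The same method can be reached through the str type itself.
print(str.upper(name))

# A module attribute can also be plain data, like this constant.
print(string.ascii_uppercase)
```

Note that upper lives on str, not in the string module, which is why no import is needed to use it.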


Hi stetim94, thank you for your assistance.

I am very new to this. To be honest, I don’t know the difference between methods and attributes. I am trying to replicate the lesson, but it is not working.

The lesson said:

We can use the apply function to apply a function to every value in a particular column. For example, this code overwrites the existing 'Name' columns by applying the function upper to every row in 'Name' .

from string import upper 

df['Name'] = df.Name.apply(upper)

They also have another example:
[image]

I tried both ways:

  1. from string import upper, but I received the error: “ImportError: cannot import name 'upper' from 'string' (C:\Users\gland\anaconda3\lib\string.py)”

  2. import string and then df['estado'] = df.estado.apply[string.upper], but it returned the error: “AttributeError: module 'string' has no attribute 'upper'.”

What do you suggest to me?
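One possible way around both errors in Python 3 is to skip the string module and use the built-in str.upper method, or the .str accessor that pandas provides on text columns. The sketch below is not the lesson's code: the DataFrame contents and the total column are invented stand-ins for the real data.

```python
import pandas as pd

# Hypothetical data standing in for the real file.
df = pd.DataFrame({"estado": ["Jalisco", "Sonora"], "total": [10, 20]})

# Option 1: pass the built-in str.upper method to apply (no import needed).
df["estado"] = df.estado.apply(str.upper)

# Option 2: the .str accessor, which leaves missing values as missing
# instead of raising an error on them.
df["estado_lower"] = df.estado.str.lower()

print(df)
```

Both options return a new column, so the result has to be assigned back to the DataFrame, as in the lesson.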

It’s not even the built-in upper method.

from string import upper and then string.upper looks fine, except that .apply() is a method, so I’m not sure why you used square brackets.

Using brackets instead of parentheses was a mistake. I just changed it, but I still receive the same error.

df['estado'] = df.estado.apply(string.upper)

AttributeError Traceback (most recent call last)
in
----> 1 df['estado'] = df.estado.apply(string.upper)

AttributeError: module 'string' has no attribute 'upper'

Can I see your full code?

Hi,

The screenshots are below. Or would you prefer the actual file?


I had no idea this was pandas-specific. Sorry I responded; I’m not sure I am the right person to help you. I can find nothing on the string module, and all I can find on upper is one mention here:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html

and something on Series:

https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html

Other than that, no idea.

I looked in one of the lessons, and I also saw import codecademylib. Maybe import string is something from the codecademy library?

Applying my general programming knowledge, you could run:

dir(string)

This will tell you what is inside the string module. I would like to see that output.
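A slightly fuller version of that check might look like the sketch below. The variable name public_names is made up, and the exact list printed will depend on the Python version installed.

```python
import string

# List the public names the module actually exposes.
public_names = [n for n in dir(string) if not n.startswith("_")]
print(public_names)

# Check directly whether the names used in the lesson are present.
print(hasattr(string, "upper"), hasattr(string, "lower"))
```

If both checks print False, the module simply does not define those functions, which matches the AttributeError.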

Yes, it is pandas.

I looked at the string module. This is the content of string.py:

$ cat string.py
"""A collection of string constants.

Public module variables:

whitespace -- a string containing all ASCII whitespace
ascii_lowercase -- a string containing all ASCII lowercase letters
ascii_uppercase -- a string containing all ASCII uppercase letters
ascii_letters -- a string containing all ASCII letters
digits -- a string containing all ASCII decimal digits
hexdigits -- a string containing all ASCII hexadecimal digits
octdigits -- a string containing all ASCII octal digits
punctuation -- a string containing all ASCII punctuation characters
printable -- a string containing all ASCII characters considered printable

"""

__all__ = ["ascii_letters", "ascii_lowercase", "ascii_uppercase", "capwords",
           "digits", "hexdigits", "octdigits", "printable", "punctuation",
           "whitespace", "Formatter", "Template"]

import _string

# Some strings for ctype-style character classification
whitespace = ' \t\n\r\v\f'
ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
ascii_letters = ascii_lowercase + ascii_uppercase
digits = '0123456789'
hexdigits = digits + 'abcdef' + 'ABCDEF'
octdigits = '01234567'
punctuation = r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
printable = digits + ascii_letters + punctuation + whitespace

# Functions which aren't available as string methods.

# Capitalize the words in a string, e.g. " aBc  dEf " -> "Abc Def".
def capwords(s, sep=None):
    """capwords(s [,sep]) -> string

    Split the argument into words using split, capitalize each
    word using capitalize, and join the capitalized words using
    join.  If the optional second argument sep is absent or None,
    runs of whitespace characters are replaced by a single space
    and leading and trailing whitespace are removed, otherwise
    sep is used to split and join the words.

    """
    return (sep or ' ').join(x.capitalize() for x in s.split(sep))


####################################################################
import re as _re
from collections import ChainMap as _ChainMap

class _TemplateMetaclass(type):
    pattern = r"""
    %(delim)s(?:
      (?P<escaped>%(delim)s) |   # Escape sequence of two delimiters
      (?P<named>%(id)s)      |   # delimiter and a Python identifier
      {(?P<braced>%(bid)s)}  |   # delimiter and a braced identifier
      (?P<invalid>)              # Other ill-formed delimiter exprs
    )
    """

    def __init__(cls, name, bases, dct):
        super(_TemplateMetaclass, cls).__init__(name, bases, dct)
        if 'pattern' in dct:
            pattern = cls.pattern
        else:
            pattern = _TemplateMetaclass.pattern % {
                'delim' : _re.escape(cls.delimiter),
                'id'    : cls.idpattern,
                'bid'   : cls.braceidpattern or cls.idpattern,
                }
        cls.pattern = _re.compile(pattern, cls.flags | _re.VERBOSE)


class Template(metaclass=_TemplateMetaclass):
    """A string class for supporting $-substitutions."""

    delimiter = '$'
    # r'[a-z]' matches to non-ASCII letters when used with IGNORECASE, but
    # without the ASCII flag.  We can't add re.ASCII to flags because of
    # backward compatibility.  So we use the ?a local flag and [a-z] pattern.
    # See https://bugs.python.org/issue31672
    idpattern = r'(?a:[_a-z][_a-z0-9]*)'
    braceidpattern = None
    flags = _re.IGNORECASE

    def __init__(self, template):
        self.template = template

    # Search for $$, $identifier, ${identifier}, and any bare $'s

    def _invalid(self, mo):
        i = mo.start('invalid')
        lines = self.template[:i].splitlines(keepends=True)
        if not lines:
            colno = 1
            lineno = 1
        else:
            colno = i - len(''.join(lines[:-1]))
            lineno = len(lines)
        raise ValueError('Invalid placeholder in string: line %d, col %d' %
                         (lineno, colno))

    def substitute(*args, **kws):
        if not args:
            raise TypeError("descriptor 'substitute' of 'Template' object "
                            "needs an argument")
        self, *args = args  # allow the "self" keyword be passed
        if len(args) > 1:
            raise TypeError('Too many positional arguments')
        if not args:
            mapping = kws
        elif kws:
            mapping = _ChainMap(kws, args[0])
        else:
            mapping = args[0]
        # Helper function for .sub()
        def convert(mo):
            # Check the most common path first.
            named = mo.group('named') or mo.group('braced')
            if named is not None:
                return str(mapping[named])
            if mo.group('escaped') is not None:
                return self.delimiter
            if mo.group('invalid') is not None:
                self._invalid(mo)
            raise ValueError('Unrecognized named group in pattern',
                             self.pattern)
        return self.pattern.sub(convert, self.template)

    def safe_substitute(*args, **kws):
        if not args:
            raise TypeError("descriptor 'safe_substitute' of 'Template' object "
                            "needs an argument")
        self, *args = args  # allow the "self" keyword be passed
        if len(args) > 1:
            raise TypeError('Too many positional arguments')
        if not args:
            mapping = kws
        elif kws:
            mapping = _ChainMap(kws, args[0])
        else:
            mapping = args[0]
        # Helper function for .sub()
        def convert(mo):
            named = mo.group('named') or mo.group('braced')
            if named is not None:
                try:
                    return str(mapping[named])
                except KeyError:
                    return mo.group()
            if mo.group('escaped') is not None:
                return self.delimiter
            if mo.group('invalid') is not None:
                return mo.group()
            raise ValueError('Unrecognized named group in pattern',
                             self.pattern)
        return self.pattern.sub(convert, self.template)



########################################################################
# the Formatter class
# see PEP 3101 for details and purpose of this class

# The hard parts are reused from the C implementation.  They're exposed as "_"
# prefixed methods of str.

# The overall parser is implemented in _string.formatter_parser.
# The field name parser is implemented in _string.formatter_field_name_split

class Formatter:
    def format(*args, **kwargs):
        if not args:
            raise TypeError("descriptor 'format' of 'Formatter' object "
                            "needs an argument")
        self, *args = args  # allow the "self" keyword be passed
        try:
            format_string, *args = args # allow the "format_string" keyword be passed
        except ValueError:
            raise TypeError("format() missing 1 required positional "
                            "argument: 'format_string'") from None
        return self.vformat(format_string, args, kwargs)

    def vformat(self, format_string, args, kwargs):
        used_args = set()
        result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
        self.check_unused_args(used_args, args, kwargs)
        return result

    def _vformat(self, format_string, args, kwargs, used_args, recursion_depth,
                 auto_arg_index=0):
        if recursion_depth < 0:
            raise ValueError('Max string recursion exceeded')
        result = []
        for literal_text, field_name, format_spec, conversion in \
                self.parse(format_string):

            # output the literal text
            if literal_text:
                result.append(literal_text)

            # if there's a field, output it
            if field_name is not None:
                # this is some markup, find the object and do
                #  the formatting

                # handle arg indexing when empty field_names are given.
                if field_name == '':
                    if auto_arg_index is False:
                        raise ValueError('cannot switch from manual field '
                                         'specification to automatic field '
                                         'numbering')
                    field_name = str(auto_arg_index)
                    auto_arg_index += 1
                elif field_name.isdigit():
                    if auto_arg_index:
                        raise ValueError('cannot switch from manual field '
                                         'specification to automatic field '
                                         'numbering')
                    # disable auto arg incrementing, if it gets
                    # used later on, then an exception will be raised
                    auto_arg_index = False

                # given the field_name, find the object it references
                #  and the argument it came from
                obj, arg_used = self.get_field(field_name, args, kwargs)
                used_args.add(arg_used)

                # do any conversion on the resulting object
                obj = self.convert_field(obj, conversion)

                # expand the format spec, if needed
                format_spec, auto_arg_index = self._vformat(
                    format_spec, args, kwargs,
                    used_args, recursion_depth-1,
                    auto_arg_index=auto_arg_index)

                # format the object and append to the result
                result.append(self.format_field(obj, format_spec))

        return ''.join(result), auto_arg_index


    def get_value(self, key, args, kwargs):
        if isinstance(key, int):
            return args[key]
        else:
            return kwargs[key]


    def check_unused_args(self, used_args, args, kwargs):
        pass


    def format_field(self, value, format_spec):
        return format(value, format_spec)


    def convert_field(self, value, conversion):
        # do any conversion on the resulting object
        if conversion is None:
            return value
        elif conversion == 's':
            return str(value)
        elif conversion == 'r':
            return repr(value)
        elif conversion == 'a':
            return ascii(value)
        raise ValueError("Unknown conversion specifier {0!s}".format(conversion))


    # returns an iterable that contains tuples of the form:
    # (literal_text, field_name, format_spec, conversion)
    # literal_text can be zero length
    # field_name can be None, in which case there's no
    #  object to format and output
    # if field_name is not None, it is looked up, formatted
    #  with format_spec and conversion and then used
    def parse(self, format_string):
        return _string.formatter_parser(format_string)


    # given a field_name, find the object it references.
    #  field_name:   the field being looked up, e.g. "0.name"
    #                 or "lookup[3]"
    #  used_args:    a set of which args have been used
    #  args, kwargs: as passed in to vformat
    def get_field(self, field_name, args, kwargs):
        first, rest = _string.formatter_field_name_split(field_name)

        obj = self.get_value(first, args, kwargs)

        # loop through the rest of the field_name, doing
        #  getattr or getitem as needed
        for is_attr, i in rest:
            if is_attr:
                obj = getattr(obj, i)
            else:
                obj = obj[i]

        return obj, first

To me, import string looks like this module:

https://docs.python.org/3/library/string.html

which is a standard Python module.

But as we can see, it offers no way to convert to upper or lowercase.

Is it possible to get the values from the column in Python? I would do that, then call the .lower() method on each value, and assign the result back to the column.
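That idea can be sketched in a few lines. The data here is invented, and tolist() is just one way to pull the values out of a column as ordinary Python strings.

```python
import pandas as pd

# Hypothetical data standing in for the real file.
df = pd.DataFrame({"estado": ["JALISCO", "SONORA"]})

# Pull the column values out as a plain Python list of strings.
values = df["estado"].tolist()

# Call the built-in .lower() method on each string.
lowered = [value.lower() for value in values]

# Put the converted values back into the column.
df["estado"] = lowered
print(df)
```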

Hello Gustavo,
I think the issue might be that you’re applying the method to an object and not a string.
If you look at the data types in .info(), you will see that the estado column, the one you’re calling .apply() on, has dtype object.
If you can change the dtype of the estado column to a string, then I think you can use .apply().
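A quick check may show whether the dtype is really the obstacle. This is a sketch with invented data. Depending on the pandas version, a text column reports its dtype as object or as a dedicated string type, but in both cases the individual values are ordinary Python strings.

```python
import pandas as pd

# Hypothetical data standing in for the real file.
df = pd.DataFrame({"estado": ["Jalisco", "Sonora"]})

# The reported dtype varies by pandas version.
print(df["estado"].dtype)

# Each value in the column is still a regular Python str.
all_strings = all(isinstance(value, str) for value in df["estado"])
print(all_strings)

# String methods therefore work without converting the column first.
upper_values = df["estado"].str.upper().tolist()
print(upper_values)
```

If all_strings is True, converting the dtype should not be necessary for upper or lower to work.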

I did the following:

But it’s returning only the column estados. How can I get all the columns in the data frame, with the column estados in lowercase?
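Without seeing the code, one likely cause is that the lowercase result was displayed but never assigned back to the DataFrame. The sketch below uses invented data and a made-up total column to show the difference.

```python
import pandas as pd

# Hypothetical data standing in for the real file.
df = pd.DataFrame({"estado": ["JALISCO", "SONORA"], "total": [10, 20]})

# On its own, this expression only produces the transformed column (a Series).
only_column = df["estado"].str.lower()
print(only_column)

# Assigning it back replaces that column and keeps every other column.
df["estado"] = only_column
print(df)
```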

Hi

Thank you for your help. Yes, I can see that it is an object. What do you think could be the solution to my problem?

I found this explanation of attribute errors:

https://www.pythonpool.com/attribute-error-python/

And I’m looking into converting that column. I think it’s covered in this StackOverflow post:

https://stackoverflow.com/questions/33957720/how-to-convert-column-with-dtype-as-object-to-string-in-pandas-dataframe

I’m still researching this. I think I might be wrong though on having to convert that column’s dtype…

I also found this article on .apply function:
https://www.delftstack.com/howto/python-pandas/how-to-convert-pandas-dataframe-column-to-string/

The pandas dtype “object” holds strings or mixed types.
https://pbpython.com/pandas_dtypes.html

Thank you.

I wonder why running from string import upper returns the error: ImportError: cannot import name 'upper' from 'string'?
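A probable explanation is the Python version. The string module in Python 2 had upper and lower functions, but they were removed in Python 3, where the same behaviour is available as methods of str. The sketch below shows the import failing and a fallback that gives the same result; the fallback line is a suggestion, not something from the lesson.

```python
import sys

try:
    # This import works on Python 2 but not on Python 3.
    from string import upper
except ImportError as err:
    print(f"Python {sys.version_info.major}: {err}")
    # Fall back to the built-in str method, which does the same job.
    upper = str.upper

print(upper("estado"))
```

The lesson's code was presumably written for an environment where that import still worked.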

I have never imported from string import upper to change a column from all caps to first letter cap, then lowercase.

It’s easier to do in Excel! (And I never thought I’d say something like that. hahaha)

If it’s a CSV file, is it possible for you to do some data cleanup on the columns beforehand and then import the .csv file into a pandas df?

In addition to import pandas as pd, did you also import numpy as np in your notebook?

I am trying to escape from Excel using python and pandas :-).

I could change that in Excel, but I’m trying to apply what I’m learning in this course, and it’s frustrating that it doesn’t work.

It seems to be very simple, but for some reason it doesn’t work.


Ah, okay, I understand getting away from Excel.
I say this a LOT.

This might be a helpful intro to dtypes in pandas and NumPy:
https://pbpython.com/pandas_dtypes.html