Python Packages for Data Mining: A Beginner's Python Study Notes (17), Common Data Mining Packages - Pandas (6)...

Basic String Methods

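Hello everyone, I'm back! In the previous posts we took a brief look at the basic operations of pandas. Whenever data is involved, though, the most common type is String, so much of the time we are really working with strings. Today I'll share the commonly used string methods, and I hope they help!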

Format :

latitude = '37.24N'

longitude = '-115.81W'

'Coordinates {0},{1}'.format(latitude,longitude)

>>> 'Coordinates 37.24N,-115.81W'

f'Coordinates {latitude},{longitude}'

>>>'Coordinates 37.24N,-115.81W'

'{0},{1},{2}'.format(*('abc'))

>>>'a,b,c'

coord = {"latitude":latitude,"longitude":longitude}

'Coordinates {latitude},{longitude}'.format(**coord)

>>>'Coordinates 37.24N,-115.81W'


Access an argument's attributes :

class Point:

    def __init__(self,x,y):

        self.x,self.y = x,y

    def __str__(self):

        return 'Point({self.x},{self.y})'.format(self = self)

    def __repr__(self):

        return f'Point({self.x},{self.y})'

test_point = Point(4,2)

test_point

>>> Point(4,2)

str(Point(4,2))

>>>'Point(4,2)'


Replacing %s and %r with !s and !r :

" repr() shows the quote {!r}, while str() doesn't:{!s} ".format('a1','a2')

>>>" repr() shows the quote 'a1', while str() doesn't:a2 "
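The same !r and !s conversion flags also work inside f-strings. A minimal sketch, assuming a hypothetical variable `name` that is not part of the original notes:

```python
# Hypothetical value used only for illustration
name = 'a1'

# !r applies repr() to the value, so the quotes are kept
with_repr = f'{name!r}'

# !s applies str() to the value, so no quotes are added
with_str = f'{name!s}'

print(with_repr)
print(with_str)
```

This can be handy when debugging, since the repr() form makes stray whitespace and empty strings visible.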


Align :

'{:<30}'.format('left aligned')

>>>'left aligned                  '

'{:>30}'.format('right aligned')

>>>'                 right aligned'

'{:^30}'.format('centered')

>>>'           centered           '

'{:*^30}'.format('centered')

>>>'***********centered***********'
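Alignment specifiers are often combined to print simple fixed-width tables. The following is only a sketch; the `rows` data and the column widths (10 and 5) are made up for illustration:

```python
# Hypothetical rows used only for illustration
rows = [('apple', 3), ('kiwi', 12)]

# Left-align the name in 10 columns, right-align the count in 5 columns
lines = ['{:<10}{:>5}'.format(name, count) for name, count in rows]

for line in lines:
    print(line)
```

Every formatted line ends up 15 characters wide, so the columns line up regardless of the length of each name.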


Replacing %x and %o, and converting the value to different bases :

"int:{0:d}, hex:{0:x}, oct:{0:o}, bin:{0:b}".format(42)

>>>'int:42, hex:2a, oct:52, bin:101010'

'{:,}'.format(12345677)

>>>'12,345,677'


Percentage :

points = 19

total = 22

'Correct answers: {:.2%}'.format(points/total)

>>>'Correct answers: 86.36%'


Date :

import datetime as dt

f"{dt.datetime.now():%Y-%m-%d}"

>>>'2019-03-27'

f"{dt.datetime.now():%d_%m_%Y}"

>>>'27_03_2019'

today = dt.datetime.today().strftime("%d_%m_%Y")

today

>>>'27_03_2019'
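The outputs above depend on the day the code is run. To go the other way, from a formatted string back to a datetime, strptime() accepts the same format codes. A small sketch with a fixed date chosen only to match the outputs shown above:

```python
import datetime as dt

# Fixed date so the result does not depend on when the code runs
fixed = dt.datetime(2019, 3, 27)

# datetime -> string
text = fixed.strftime("%d_%m_%Y")

# string -> datetime, using the same format codes
parsed = dt.datetime.strptime(text, "%d_%m_%Y")

print(text)
print(parsed == fixed)
```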


Split without parameters :

"this is a test".split()

>>>['this', 'is', 'a', 'test']


Concatenate :

'do'*2

>>>'dodo'

orig_string ='Hello'

orig_string+',World'

>>>'Hello,World'

full_sentence = orig_string+',World'

full_sentence

>>>'Hello,World'


Check string type, slice, count, strip :

strings = ['do','re','mi']

', '.join(strings)

>>>'do, re, mi'

'z' not in 'abc'

>>>True

ord('a'), ord('#')

>>>(97, 35)

chr(97)

>>>'a'

s = "foodbar"

s[2:5]

>>>'odb'

s[:4] + s[4:]

>>>'foodbar'

s[:4] + s[4:] == s

>>>True

t=s[:]

id(s)

>>>1547542895336

id(t)

>>>1547542895336

s is t

>>>True
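That `s is t` returns True here is, as far as I know, a CPython optimization: strings are immutable, so a full slice can hand back the same object instead of a copy. The language itself does not promise this, and the id() values will differ on every machine, so it is safer to compare contents with ==. A small sketch (the names `same_content` and `mutated` are only for illustration):

```python
s = "foodbar"
t = s[:]  # full slice of an immutable string

# == compares contents and does not rely on object identity
same_content = (s == t)

# Strings are immutable: item assignment raises TypeError
try:
    s[0] = 'g'
    mutated = True
except TypeError:
    mutated = False

print(same_content, mutated)
```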

s[0:6:2]

>>>'fob'

s[5:0:-2]

>>>'ado'

s = 'tomorrow is monday'

reverse_s = s[::-1]

reverse_s

>>>'yadnom si worromot'

s.capitalize()

>>>'Tomorrow is monday'

s.upper()

>>>'TOMORROW IS MONDAY'

s.title()

>>>'Tomorrow Is Monday'

s.count('o')

>>>4

"foobar".startswith('foo')

>>>True

"foobar".endswith('ar')

>>>True

"foobar".endswith('oob',0,4)

>>>True

"foobar".endswith('oob',2,4)

>>>False

"My name is yo, I work at SG".find('yo')

>>>11

# If the substring is not found, find() returns -1

"My name is ya, I work at Gener".find('gent')

>>>-1
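Because find() reports a miss with -1, it is easy to forget the check. For comparison, index() raises ValueError for a missing substring, and the `in` operator is enough when the position is not needed. A minimal sketch reusing the string above (the variable names are hypothetical):

```python
text = "My name is ya, I work at Gener"

# find() signals a missing substring with -1
pos = text.find('gent')

# index() raises ValueError instead of returning -1
try:
    text.index('gent')
    raised = False
except ValueError:
    raised = True

# "in" is a simple membership test when the position is not needed
found = 'gent' in text

print(pos, raised, found)
```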

# Check whether a string consists only of alphanumeric characters

"abc123".isalnum()

>>>True

"abc%123".isalnum()

>>>False

"abcABC".isalpha()

>>>True

"abcABC1".isalpha()

>>>False

'123'.isdigit()

>>>True

'123abc'.isdigit()

>>>False

'abc'.islower()

>>>True

"This Is A Title".istitle()

>>>True

"This is a title".istitle()

>>>False

'ABC'.isupper()

>>>True

'ABC1%'.isupper()

>>>True

'foo'.center(10)

>>>'   foo    '

' foo bar baz '.strip()

>>>'foo bar baz'

' foo bar baz '.lstrip()

>>>'foo bar baz '

' foo bar baz '.rstrip()

>>>' foo bar baz'

"foo abc foo def fo ljk ".replace('foo','yao')

>>>'yao abc yao def fo ljk '

'www.realpython.com'.strip('w.moc')

>>>'realpython'

'www.realpython.com'.strip('w.com')

>>>'realpython'

'www.realpython.com'.strip('w.ncom')

>>>'realpyth'
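The last three results show that strip() treats its argument as a set of characters to remove from both ends, not as a literal prefix or suffix, which is why 'w.ncom' also eats the trailing 'on'. To remove one exact substring, removeprefix() and removesuffix() can be used instead. This sketch assumes Python 3.9 or later, where those two methods were added:

```python
url = 'www.realpython.com'

# strip() removes any of the given characters from both ends
stripped = url.strip('w.ncom')

# removeprefix()/removesuffix() (Python 3.9+) remove one exact substring
trimmed = url.removeprefix('www.').removesuffix('.com')

print(stripped)
print(trimmed)
```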


Convert to lists :

', '.join(['foo','bar','baz','qux'])

>>>'foo, bar, baz, qux'

list('corge')

>>>['c', 'o', 'r', 'g', 'e']

':'.join('corge')

>>>'c:o:r:g:e'

'www.foo'.partition('.')

>>>('www', '.', 'foo')

'foo@@bar@@baz'.partition('@@')

>>>('foo', '@@', 'bar@@baz')

'foo@@bar@@baz'.rpartition('@@')

>>>('foo@@bar', '@@', 'baz')

'foo.bar'.partition('@@')

>>>('foo.bar', '', '')

# By default, rsplit() splits a string on whitespace

'foo bar adf yao'.rsplit()

>>>['foo', 'bar', 'adf', 'yao']

'foo.bar.adf.ert'.split('.')

>>>['foo', 'bar', 'adf', 'ert']

'foo\nbar\nadfa\nlko'.splitlines()

>>>['foo', 'bar', 'adfa', 'lko']
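Without a limit, split() and rsplit() return the same list. The difference only shows up once maxsplit is given: split() counts from the left and rsplit() counts from the right. A minimal sketch reusing the dotted string above (the variable names are hypothetical):

```python
path = 'foo.bar.adf.ert'

# split() applies maxsplit starting from the left
left = path.split('.', 1)

# rsplit() applies maxsplit starting from the right
right = path.rsplit('.', 1)

print(left)
print(right)
```

rsplit() with maxsplit=1 is a common way to separate the last component, such as a file extension, from the rest of the string.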


Summary

Beyond the methods I've summarized above, there are many more very practical ones, and you can look them up according to your own needs!

I've put the ipynb and py files for this post on GitHub. If you'd like to download them, you can click the link below:
