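- python
- Environment setup
- Why Python is slow
- import and from
- if
- while, for (loops)
- match case (pattern matching): requires Python 3.10
- Functional programming
- Data types
- sorted
- def (functions)
- class (classes)
- weakref (weak references)
- file
- random (random numbers)
- Logging
- lib (libraries)
- cython
- mingshe: syntactic sugar
- PEP 20: pythonic (The Zen of Python)
- test: testing
- draw: plotting
- system: systems programming
- app: general applications
- concurrency: processes, threads, coroutines
- scientific computing
- network: networking
- spider: web crawling and automated testing
- debug: debugging
- algorithms
- Design Pattern: design patterns
- Geometry
- reference article (recommended articles)
- Third-party software resources
- Learning projects
- Online tools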
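- A strongly typed, dynamically typed language
- Programming language popularity rankings: Python is first in each of the listed rankings
- Can display figures drawn with plot
- Jupyter notebooks in the terminal
- `uv pip install`: replaces `pip install`
- `uv pip compile`: replaces `pip-compile`
- `uv pip sync`: replaces `pip-sync`
- `uv venv`: replaces `python -m venv`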
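- `pip install pkg`: install path for a normal user: ~/.local/lib/python3.9
- `sudo pip install pkg`: sudo means the global install path: /usr/lib/python3.9/
  - Some modules, such as scapy, need root privileges. Some modules break system dependencies, so they must not be installed with sudo
- `pip install git+https://github.com/mli/autocut.git`: installs a module hosted on GitHub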
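Installation process:
- Build first, then install. The two stages are separate and can be handled by different tools
  - 1. build: build the source code into a wheel (.whl file)
  - 2. install: unpack the wheel and move the files into the target directories
- Some PyPI packages ship prebuilt wheels for your operating system (the build can be skipped), while others have to be built locally
- A wheel file is usually smaller than the source code
- The `--no-binary=:all:` option tells pip to download the source and build it locally even when a wheel is available
    pip install --no-binary=:all: package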
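- PEP 660: editable installs through the build backend
  - After the source is edited there is no need to rebuild the wheel; the change takes effect once the Python program is restarted (close to hot reloading)
    pip install -e .
  - The backend currently uses pdm's proxy mode
- `pip install something` fails with:
    ModuleNotFoundError: No module named 'pip._vendor.packaging'
    # fix
    curl -sS https://bootstrap.pypa.io/get-pip.py | sudo python3
Multiple versions side by side, or isolated from each other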
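- Install two packages: pyenv and pyenv-virtualenv
# list the installable versions
pyenv install --list
# install version 2.7.18
pyenv install -v 2.7.18
# show the current version and the available versions
pyenv versions
pyenv global 2.7.18
- virtualenv:
  - Can manage pure-Python libraries, but not extension libraries
  - Cannot manage the compilation of Python itself
  - pip is isolated per environment as well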
# terminal 1
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
pyenv virtualenv 2.7.13 first
pyenv activate first
# terminal 2
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
pyenv virtualenv system second
pyenv activate second
# delete the virtual environments
pyenv virtualenv-delete first
pyenv virtualenv-delete second
# pretty-print JSON
echo '{"1": "123", "2": "321"}' | python -m json.tool
# serve the files in the current directory
python -m http.server 8080
# browse the documentation in a web browser
python -m pydoc -p 1234
# watch for file changes
python -m pyinotify -v /tmp
# open a web page
python -m webbrowser -t "http://www.python.org"
# generate pem and key certificate files
pip install trustme
python -m trustme -i baidu.com
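The official Python implementation is CPython.
CPython reads the source, tokenizes and parses it, compiles it to bytecode, and then interprets and executes that bytecode.
- GIL (Global Interpreter Lock)
- JIT (Just-in-time compilation):
  - How it works:
    - 1. The code is split into blocks through an intermediate language
    - 2. Bytecode is still used at run time; the JIT does not make the bytecode itself run faster
    - 3. Instead it detects which code runs many times, marks it as hot spots, and optimizes those hot spots
  - Drawbacks:
    - 1. Slow startup: CPython's own startup is already slow, and PyPy with its JIT starts 2-3 times slower still
    - 2. Dynamic languages are hard to optimize
      - Comparing and converting types is expensive: the type is checked every time a variable is read, written, or referenced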
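- Cython: trades flexibility for performance
- Jython: implemented in Java. Python threads in Jython are Java threads, managed by the JVM
- CPython: the libraries are written in C
  - GIL: present; CPython uses reference counting, and the GIL protects those counts
  - JIT
- PyPy: about 3 times faster than CPython 2.7. The libraries are written in RPython
  - GIL
  - JIT
- Pyodide: compiled to WebAssembly so that it runs in the browser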
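- Implementation plan for speeding up CPython
  - A five-person development team funded by Microsoft (including the creator of Python)
  - The goal is to make CPython 5 times faster, in 4 stages:
    - (1) 3.10: no runtime code generation
    - (2) 3.11: narrower int representation; faster operators, calls, and returns; improved memory layout with less memory-management overhead; zero-overhead exception handling...
    - (3) 3.12: add a JIT
    - (4) 3.13: generate superior machine code
  - The end goal is to add several execution tiers
    - A program has two parts:
      - Hot (executed often): faster execution is worth it even if loading is slower and more memory is used
      - Cold (executed rarely): faster loading is worth it even if execution is slower
    - Hot and cold code are not easy to tell apart; execution tiers are added to handle the run-time characteristics across this hot-to-cold range
    - Tiers 0-3 are currently under consideration; the higher the tier, the hotter the code:
      - CPython 3.9 is considered to sit between tier 0 and tier 1
      - All code starts in tier 0 and moves to higher tiers as it keeps running
      - Tier 0: lower cost of loading from disk into memory, lower memory use
        - For code that runs once or never; code that runs more than once moves to tier 1
        - For example: module loading, exception handling...
      - Higher tiers apply more optimizations and have fewer limits on resource use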
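file test:
# file test
__all__ = ['a']
a = 1
b = 2
import usage
- Creates a new namespace; the module is loaded only once
- Note: an import placed inside a function is visible only inside that function
file test1:
# file test1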
import test
print(test.a)
print(test.b)
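The claim that a module is loaded only once can be checked against the `sys.modules` cache. This is a minimal sketch that uses the standard-library `json` module as a stand-in for the `test` file above, so it runs without creating any files.

```python
import importlib
import sys

import json            # first import: executes the module and caches it
import json as again   # second import: only a lookup in sys.modules

# both names refer to the single cached module object
assert again is json
assert sys.modules["json"] is json

# importlib.reload() re-executes the module body inside the same object
reloaded = importlib.reload(json)
assert reloaded is json
```

A repeated `import` is therefore cheap, and edits to a module only show up after a restart or an explicit `importlib.reload()`.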
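from module import name, name1 usage
- Binds the imported names in the current namespace
- Note: `from module import *` is only allowed at module level, not inside a class or function
a = 0
from test import *
# the value of a is overwritten
print(a)
# __all__ = ['a'] does not export b, so this raises NameError
print(b)
# now import b explicitly
from test import b
print(b)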
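- When a from import spans several lines, use () instead of \
    # the two forms are equivalent
    from timeit import timeit, repeat, \
        main, reindent
    from timeit import (timeit, repeat,
                        main, reindent)
- Note: () works only in from statements; with import it is a syntax error
    import timeit, sys, \
        time, re
    # SyntaxError
    import (timeit, sys,
            time, re)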
-
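Preferred | Avoid |
---|---|
`if v is None:` (or `if not v:` when any falsy value should count) | `if v == None:` |
`if v is not None:` (or `if v:` when any truthy value should count) | `if v != None:` |
Check whether a variable is defined: if 'v' in locals():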
- or assignment: `or` returns the first truthy operand
    b = None
    c = None
    # v = 1
    v = b or 1 or c
- False, True
    age = False
    s = age or 24
    print(s)
    age = True
    s = age or 24
    print(s)
Output
    24
    True
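One caveat with `or` defaults is that every falsy value (0, '', [], False) is replaced, not just None. The sketch below contrasts the two behaviours; `pick_port` and its parameters are hypothetical names used only for illustration.

```python
def pick_port(configured=None, default=8080):
    # compare against None explicitly so that a legitimate 0 survives
    return default if configured is None else configured


# `or` skips 0 because 0 is falsy
or_result = 0 or 8080

print(or_result)       # 8080
print(pick_port(0))    # 0
print(pick_port())     # 8080
```

Use `or` when any falsy value should fall back to the default, and an `is None` test when only a missing value should.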
x = 1
# raises AssertionError
assert x > 1
- Custom exception
class Myerror(Exception):
    def __str__(self):
        return 'define error'
def f():
    raise Myerror()
try:
    f()
except Myerror:
    print('ok')
- With arguments
class Myerror(Exception):
    def __init__(self, message, status):
        super().__init__(message, status)
        self.message = message
        self.status = status
    def __str__(self):
        return 'define error'
def f():
    raise Myerror('error', 1)
try:
    f()
except Myerror as e:
    # ('error', 1)
    print(e.args)
- Use `except Exception` to catch all ordinary exceptions
  - SystemExit, KeyboardInterrupt, and GeneratorExit are not caught this way; to catch these three as well, use `BaseException`
    except Exception as e:
- Raise an exception from inside an exception handler
def example():
try:
int('N/A')
except ValueError as e:
# raise ... from e
raise RuntimeError('A parsing error occurred') from e
example()
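With `raise ... from e` the original exception stays reachable through the `__cause__` attribute of the new one. A small sketch, where `parse` is a hypothetical helper that mirrors `example()` above:

```python
def parse(text):
    try:
        return int(text)
    except ValueError as e:
        # chain the low-level error to a higher-level one
        raise RuntimeError('A parsing error occurred') from e


try:
    parse('N/A')
except RuntimeError as err:
    cause = err.__cause__
    # the original ValueError is preserved on the new exception
    print(type(cause).__name__)   # ValueError
```

The traceback printed for an uncaught chained exception shows both errors, joined by "The above exception was the direct cause of the following exception".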
- Handling several exceptions together
try:
    client_obj.get_url(url)
except (URLError, ValueError, SocketTimeout):
    client_obj.remove_url(url)
- Handling different exceptions differently
try:
    client_obj.get_url(url)
except (URLError, ValueError):
    client_obj.remove_url(url)
except SocketTimeout:
    client_obj.handle_url_timeout(url)
import errno
try:
    f = open(filename)
except OSError as e:
    if e.errno == errno.ENOENT:
        logger.error('File not found')
    elif e.errno == errno.EACCES:
        logger.error('Permission denied')
    else:
        logger.error('Unexpected error: %d', e.errno)
import fuckit
@fuckit
def func():
pass
- 1. Call function b
- 2. Assign the result to a
- 3. Test whether a is truthy (for example, not None)
def b():
    return 'not None'
# plain version
a = b()
if (a):
    print(a)
# walrus operator
if (a := b()):
    print(a)
def f(x):
return x + 1
data = [1, 2, 3]
# 普通写法
results = []
for x in data:
result = f(x)
if result:
results.append(result)
# 普通写法1
results = [
f(x) for x in data
if f(x)
]
# 海象运算符
results = [
y for x in data
if (y := f(x))
]
# f(x)赋值y
stuff = [[y := f(x), x * y] for x in range(3)]
# False选第一个元素
state = False
array = ('0', '1', '2')[state]
print(array)
# True选第二个元素
state = True
array = ('0', '1', '2')[state]
print(array)
输出
0
1
- Python 3's `range` replaces Python 2's `xrange`
- Definition: start, stop, step
    for i in range(1, 12, 2):
        print(i)
    # a negative step counts down (reverse)
    for i in range(12, 1, -2):
        print(i)
    for i in range(10_000, 1_000_001, 20_000):
        print(i)
- The `*` operator
    for a, *b in ([1, 2], [3, 4, 5]):
        print(a)
        print(b)
  - Output
    1
    [2]
    3
    [4, 5]
# the else branch does not run (the loop hits break)
for i in range(10):
    if i == 9:
        print(i)
        break
else:
    print("Not found")
# the else branch runs (the loop never hits break)
for i in range(10):
    if i == 11:
        print(i)
        break
else:
    print("Not found")
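A common use of `for ... else` is a search loop in which the `else` branch means "nothing was found". The sketch below applies it to a primality test; `is_prime` is an illustrative helper, not something from the notes.

```python
def is_prime(n):
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            break          # a divisor was found: the else branch is skipped
    else:
        return True        # the loop finished without break
    return False


primes = [n for n in range(20) if is_prime(n)]
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19]
```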
- Basic usage
    list1 = [0, 1, 2]
    # plain version (not pythonic)
    for index in range(len(list1)):
        print(index, list1[index])
    # enumerate() syntactic sugar
    for index, n in enumerate(list1):
        print(index, n)
  Output:
    0 0
    1 1
    2 2
- `enumerate(array, 1)`: the second argument is the starting index, here 1
    for index, n in enumerate(list1, 1):
        print(index, n)
  Output:
    1 0
    2 1
    3 2
- iter()
list1 = [0, 1, 2]
it = iter(list1)
next(it) # 0
next(it) # 1
next(it) # 2
next(it) # StopIteration
- `__iter__()` forwards the iteration request to the internal _list attribute
- `__next__()` returns the next item
- Iterating over each line of a file
# 相当于cat /home/tz/test.py
with open('/home/tz/test.py') as file:
while True:
try:
line = next(file)
print(line)
except StopIteration:
exit(0)
- iter() 控制读取大小
import sys
with open('/home/tz/test.py') as file:
# 每次读10
for chunk in iter(lambda: file.read(10), ''):
sys.stdout.write(chunk)
- Implementing range() with the iterator protocol
class range:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __iter__(self):
        return self
    def __next__(self):
        # stop before y, like the built-in range
        if self.x < self.y:
            x = self.x
            self.x += 1
            return x
        else:
            raise StopIteration
a = range(1, 10)
for i in a:
    print(i)
- islice(): slicing for iterators
    from itertools import islice
    # same as a[1:6]
    a = range(1, 10)
    for i in islice(a, 1, 6):
        print(i, end='')
    # 23456
- permutations(): every ordering of the elements (an element is never repeated within one result)
    from itertools import permutations
    list1 = ['a', 'b', 'c']
    for i in permutations(list1):
        print(i)
  Output
    ('a', 'b', 'c')
    ('a', 'c', 'b')
    ('b', 'a', 'c')
    ('b', 'c', 'a')
    ('c', 'a', 'b')
    ('c', 'b', 'a')
  - Setting the length of each permutation
    from itertools import permutations
    list1 = ['a', 'b', 'c']
    for i in permutations(list1, 2):
        print(i)
  Output
    ('a', 'b')
    ('a', 'c')
    ('b', 'a')
    ('b', 'c')
    ('c', 'a')
    ('c', 'b')
- combinations_with_replacement(): combinations in which an element may repeat
    from itertools import combinations_with_replacement
    list1 = ['a', 'b', 'c']
    for i in combinations_with_replacement(list1, 3):
        print(i)
  Output
    ('a', 'a', 'a')
    ('a', 'a', 'b')
    ('a', 'a', 'c')
    ('a', 'b', 'b')
    ('a', 'b', 'c')
    ('a', 'c', 'c')
    ('b', 'b', 'b')
    ('b', 'b', 'c')
    ('b', 'c', 'c')
    ('c', 'c', 'c')
- zip_longest(): handles inputs of unequal length
    from itertools import zip_longest
    # the inputs have different lengths
    list1 = [1, 2, 3, 4]
    list2 = [10, 20, 30, 40, 50]
    # fill missing items with 0
    for i in zip_longest(list1, list2, fillvalue=0):
        print(i)
  Output
    (1, 10)
    (2, 20)
    (3, 30)
    (4, 40)
    (0, 50)
- chain(): a better `for x in list1 + list2:`
    from itertools import chain
    list1 = [1, 2, 3, 4]
    list2 = [10, 20, 30, 40]
    for x in chain(list1, list2):
        print(x)
  Output
    1
    2
    3
    4
    10
    20
    30
    40
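- A generator function returns an iterator; the function pauses at each yield
  - Unlike a normal function, a generator can only be used for iteration
  - next() or __next__() advances to the next value
def count(start, step):
    while True:
        yield start
        start += step
yd = count(10, 1)
yd.__next__()
next(yd)
- fib implemented with yield
class fib:
    def __init__(self, n):
        self.x, self.y = 0, 1
        self.n = n
    def __iter__(self):
        x, y = self.x, self.y
        while y <= self.n:
            yield y
            # advance to the next Fibonacci pair
            x, y = y, x + y
a = fib(10)
# prints 1 1 2 3 5 8
for i in a:
    print(i)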
- grep implemented with yield
def grep(pattern, filename):
    with open(filename) as file:
        # iterate lazily instead of loading every line with readlines()
        for line in file:
            if pattern in line:
                yield line
get_elem = grep('2', '/tmp/test')
# advance to the next match
get_elem.__next__()
next(get_elem)
# Iterable lives in collections.abc (the collections alias was removed in Python 3.10)
from collections.abc import Iterable
def flatten(items):
    for x in items:
        # if x is iterable, recurse into it
        if isinstance(x, Iterable):
            yield from flatten(x)
        else:
            yield x
list1 = [1, 2, [3, 4, [5, 6], 7], 8]
# prints 1 2 3 4 5 6 7 8
for x in flatten(list1):
    print(x)
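One limitation of the `flatten` above is that a string is itself iterable, and each of its characters is again a string, so nested data that contains strings would recurse without end. A hedged variant that treats `str` and `bytes` as atoms (the `ignore_types` parameter is an addition for illustration):

```python
from collections.abc import Iterable


def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        # recurse into iterables, but keep strings and bytes whole
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            yield from flatten(x, ignore_types)
        else:
            yield x


flat = list(flatten(['ab', [1, ['cd', 2]], 3]))
print(flat)  # ['ab', 1, 'cd', 2, 3]
```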
send() passes a value into a generator at its yield
def f():
    while True:
        n = yield
        print(n)
# test
r = f()
# the generator must be advanced to its first yield with __next__() before send()
r.__next__()
r.send('hello')
r.send('world')
Output
hello
world
- 通过装饰器包一层函数,让它自动__next__()
def wrapper(func):
def new_func():
r = func()
r.__next__()
return r
return new_func
# 消费者
@wrapper
def consumer():
while True:
# 接送producer
n = yield
print(n)
# 生产者
def producer(n):
r = consumer()
for i in range(n):
# 发送给consumer
r.send(i)
if __name__ == '__main__':
producer(10)
- send()给自己
# 即是生产者也是消费者
@wrapper
def producer():
while True:
n = yield
if n == 0:
break
print(n)
try:
# 发送给自己
my = producer()
my.send(n-1)
except StopIteration:
pass
if __name__ == '__main__':
r = producer()
r.send(10)
- 生产者负责控制步进, 消费者负责print()
# 消费者负责print()
@wrapper
def consumer():
while True:
# 接送生产者
n = yield
print(n)
# 生产者负责控制步进
@wrapper
def producer():
while True:
n = yield
if n == 0:
break
try:
# 发送给消费者
r = consumer()
r.send(n)
# 发送给自己
my = producer()
my.send(n-1)
except StopIteration:
pass
if __name__ == '__main__':
r = producer()
r.send(10)
- Returning a value through yield
def f():
    m = None
    while True:
        line = yield m
        m = line.split(',')
# test
r = f()
r.__next__()
# returns ['123', '321']
r.send('123,321')
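The same `value = yield result` pattern can carry state between calls. As an illustration (the `averager` name is an assumption, not from the notes), a coroutine that returns the running average of everything sent to it:

```python
def averager():
    total, count, average = 0.0, 0, None
    while True:
        # hand back the current average, then wait for the next value
        value = yield average
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)               # prime the generator up to its first yield
first = avg.send(10)    # 10.0
second = avg.send(20)   # 15.0
third = avg.send(30)    # 20.0
avg.close()
print(first, second, third)
```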
from collections import deque
class ActorScheduler:
def __init__(self):
# 字典保存actor
self._actors = {}
# 使用双向链表保存(actor, msg)
self._msg_queue = deque()
# 注册actor
def new_actor(self, name, actor):
self._msg_queue.append((actor, None))
self._actors[name] = actor
# actor发送给双向链表, 保存(actor, msg)
def send(self, name, msg):
actor = self._actors.get(name)
if actor:
self._msg_queue.append((actor, msg))
def run(self):
while self._msg_queue:
actor, msg = self._msg_queue.popleft()
try:
# 发送msg给对应actor. actor函数使用yield作为接收
actor.send(msg)
except StopIteration:
pass
def consumer():
while True:
# 接收run方法的send
n = yield
print('got:', n)
def producer(sched):
while True:
# 接收run方法的send
n = yield
# 退出
if n == 0:
break
# 发送给consumer
sched.send('consumer', n)
# 发送给自己, 自己也是consumer
sched.send('producer', n-1)
if __name__ == '__main__':
sched = ActorScheduler()
sched.new_actor('consumer', consumer())
sched.new_actor('producer', producer(sched))
sched.send('producer', 5)
sched.run()
输出
got: 5
got: 4
got: 3
got: 2
got: 1
- Two loop variables, iterating over two objects
list1 = [1, 2, 3, 4]
list2 = [10, 20, 30, 40]
for x, y in zip(list1, list2):
    print(x, y)
Output
1 10
2 20
3 30
4 40
- With a single loop variable, each item is a tuple
list1 = [1, 2, 3, 4]
list2 = [10, 20, 30, 40]
for i in zip(list1, list2):
    print(i)
Output
(1, 10)
(2, 20)
(3, 30)
(4, 40)
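Two related behaviours are worth knowing: `zip(*pairs)` reverses the operation ("unzip"), and `zip` stops at the shortest input. A short sketch:

```python
pairs = [(1, 10), (2, 20), (3, 30)]

# unpacking the pairs as separate arguments transposes them
left, right = zip(*pairs)
print(left)    # (1, 2, 3)
print(right)   # (10, 20, 30)

# zip stops as soon as the shortest input is exhausted
short = list(zip([1, 2, 3], 'ab'))
print(short)   # [(1, 'a'), (2, 'b')]
```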
- Matching like switch/case
    def http_error(status):
        match status:
            case 400:
                return "Bad request"
            case 401 | 403 | 404:
                return "Not allowed"
            case _:
                return "Something's wrong with the Internet"
- Matching arguments:
    import sys
    match sys.argv[1:]:
        case 'add', k, v:
            print('add', k, v)
        case _:
            print('invalid input')
  - Output
    ./test.py add name tz
    add name tz
    ./test.py abc
    invalid input
- Matching objects:
  - 1. Sequences
    # the following three case forms mean the same thing
    match [1, 2, 3]:
        case x, y, z:
            pass
        case [x, y, z]:
            pass
        case (x, y, z):
            pass
    # _ skips a single element, *_ skips all remaining elements
    match [1, 2, 3]:
        case x, _, _:
            pass
        case x, *_:
            pass
  - 2. Dictionaries:
    # **kwargs captures the remaining keys
    match {'x': 1, 'y': 2}:
        case {'x': x, 'y': y}:
            pass
        case {'x': x, **kwargs}:
            pass
        case {**kwargs}:
            pass
  - 3. Classes:
    from dataclasses import dataclass
    class o:
        x: int = 1
        y: int = 1
    class o2:
        __match_args__ = ("x", "y")
        def __init__(self):
            self.x: int = 0
            self.y: int = 0
    # alternative to the o2 above: @dataclass generates __match_args__
    @dataclass
    class o2:
        x: int = 0
        y: int = 0
    class o3:
        x: int = 0
        y: int = 0
    def f(subject):
        match subject:
            # is subject an instance of o, with x and y both equal to 1?
            case o(x=1, y=1):
                print('1')
            # positional patterns need __match_args__ or @dataclass
            case o2(x, y):
                print('2')
            # object matches an instance of any class (duck typing)
            case object(x=x, y=y):
                print('3')
    f(o())
    f(o2())
    f(o3())
  - Output
    1
    2
    3
- Matching values:
  - Comparing against attribute values
    # class o comes from the example above
    a = o()
    match (1, 1):
        case (a.x, a.y):
            pass
    match (1, 1):
        # the pattern is matched first, then the if guard is evaluated
        case (a.x, a.y) if a.x > 0:
            pass
  - Plain names are capture patterns and get rebound
    x, y = 0, 0
    match (1, 1):
        # x and y are rebound to 1
        case (x, y):
            print(x, y)
  - Output
    1 1
- Matching types:
dict1 = {'x': 1, 'y': 1.1}
match dict1:
    # is x an int and y a float?
    case {'x': int(), 'y': float()}:
        pass
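lambda:
from operator import add, mul
def mul_add(f, g):
    def h(x):
        return f(g(x, x), g(x, x))
    return h
# the same function written with a lambda
def mul_add(f, g):
    return lambda x: f(g(x, x), g(x, x))
# test
test = mul_add(mul, add)
# mul(add(12, 12), add(12, 12)) == 576
test(12)
- Functions as dictionary values. A lambda can be stored as a value too, although a named function is easier to read
def plus(x):
    return x + 1
func_list = {1: abs, 2: plus}
def wrapper(value):
    return func_list[value]
# test
func = wrapper(1)
func(-1)  # 1
func = wrapper(2)
func(-1)  # 0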
- Plain loop
items = [1, 2, 3, 4, 5]
squared = []
for i in items:
    squared.append(i**2)
- Using map
items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))
Output
[1, 4, 9, 16, 25]
- map over a list of functions
def multiply(x):
    return (x*x)
def add(x):
    return (x+x)
funcs = [multiply, add]
for i in range(5):
    value = list(map(lambda x: x(i), funcs))
    print(value)
Output
[0, 0]
[1, 2]
[4, 4]
[9, 6]
[16, 8]
- filter() returns an iterator
- It is often said to be faster than an explicit for loop; measure before relying on that
number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)
Output
[-5, -4, -3, -2, -1]
- Plain loop
total = 0
list1 = [1, 2, 3, 4]
for num in list1:
    total = total + num
- Using reduce
from functools import reduce
total = reduce((lambda x, y: x + y), [1, 2, 3, 4])
number_list = range(0, 11)
total = reduce((lambda x, y: x + y), number_list)
number_max = reduce(lambda a, b:a if a > b else b, [1, 2, 3, 4])
- Implementing reduce recursively
def listsum(list1):
if len(list1) == 1:
return list1[0]
else:
return list1[0] + listsum(list1[1:])
print(listsum([1, 3, 5, 7, 9]))
def reduce(f, list1):
if len(list1) == 1:
return list1[0]
else:
return f(reduce(f, list1[1:]), list1[0])
number_list = range(0, 11)
print(reduce(lambda x, y: x + y, number_list))
print(reduce(lambda x, y: x + y, [1, 3, 5, 7, 9]))
print(reduce(lambda a, b: a if a > b else b, [1, 2, 3, 4]))
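The standard-library `functools.reduce` also accepts an optional third argument, an initial value, which makes empty inputs safe. It folds from the left, which matters for operators that are not commutative. A short sketch using `operator` functions in place of lambdas:

```python
from functools import reduce
import operator

# the initial value is returned as-is for an empty sequence
empty_sum = reduce(operator.add, [], 0)

# ((1 * 1) * 2) * 3) * 4
product = reduce(operator.mul, [1, 2, 3, 4], 1)

# left fold: (10 - 3) - 2
left_fold = reduce(operator.sub, [10, 3, 2])

print(empty_sum, product, left_fold)  # 0 24 5
```

Without the initial value, `reduce` raises TypeError on an empty sequence.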
def square(x):
return x * x
def successor(x):
return x + 1
def near(x, f, g):
return approx_eq(f(x), g(x))
def approx_eq(x, y, tolerance = 1e-5):
return abs(x - y) < tolerance
def golden_update(guess):
return 1 / guess + 1
def golden_test(guess):
return near(guess, square, successor)
def iter_improve(update, test, guess = 1):
while not test(guess):
guess = update(guess)
return guess
# test
# 1 / guess + 1
iter_improve(golden_update, golden_test)
def first(rlist):
return rlist[0]
def rest(rlist):
return rlist[1]
# insert
def insert(rlist, x):
return (rlist,x)
def finsert(rlist, x):
return (x,rlist)
# lengh
def lengh(rlist):
n = 0
while rlist != None:
rlist, n = rest(rlist), n + 1
return n
# test
rlist = (1, (1, (2, (2, None))))
lengh(rlist)
# get item
def get(rlist, n):
while n > 0:
rlist, n = rest(rlist), n - 1
return first(rlist)
# test
get(rlist, 2)
# nonone
def nonone(rlist):
if rest(rlist) == None:
return first(rlist)
return rlist
# reverse 反转
def reverse(rlist):
x, rlist = insert(first(rlist),None), rest(rlist)
while rlist != None:
x, rlist = insert(first(rlist),x), rest(rlist)
return x
# 递归
def test(rlist, x):
if rlist != None:
x = test(rest(rlist), (first(rlist),x))
return x
def reverse(rlist):
return test(rlist, None)
# test
reverse(rlist)
# insert
def ninsert(rlist, x, n):
len, y, rerlist = lengh(rlist) - n, None, reverse(rlist)
while rerlist != None:
y, rerlist = insert(first(rerlist),y), rest(rerlist)
len = len - 1
if len == 0:
y = insert(x, y)
return y
# test
ninsert(rlist, 0, 2)
def einsert(rlist, x):
return ninsert(rlist, x, lengh(rlist))
def einsert(rlist, x):
x = insert(0, None)
rerlist = reverse(rlist)
link(x, rerlist)
return x
# test
einsert(rlist, 0)
元组操作序列:
# count 计算一个值,在序列出现的次数
def count(rlist, x):
n = 0
while rlist != None:
if (x == first(rlist)):
n = n + 1
rlist = rest(rlist)
return n
# bug
def count(rlist, x):
n = 0
for i in rlist:
print(i)
if i == x:
n = n + 1
return n
count(rlist, 1)
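- Everything is an object
- Every object has an id (address), a type, and a value. `a is b` is effectively `id(a) == id(b)`. `is` is cheaper than `==`, but it tests identity rather than equality
- list, dict, and set are mutable: changing the value does not create a new object
    a = [1, [2], 3]
    b = a
    # b is another name for the same list, so this changes a as well
    b[0] = 2
    import copy
    # a shallow copy does not copy nested objects
    b = copy.copy(a)
    # changing a top-level element does not affect a
    b[0] = 3
    # changing a nested object does affect a
    b[1][0] = 1
    # a deep copy is fully independent of the original
    b = copy.deepcopy(a)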
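The difference between an alias, a shallow copy, and a deep copy can be verified directly. A minimal sketch:

```python
import copy

a = [1, [2], 3]
alias = a                  # same object
shallow = copy.copy(a)     # new outer list, shared inner list
deep = copy.deepcopy(a)    # fully independent

a[1].append(99)            # mutate the nested list
a[0] = 'changed'           # rebind a top-level slot

print(alias)    # ['changed', [2, 99], 3]
print(shallow)  # [1, [2, 99], 3]
print(deep)     # [1, [2], 3]
```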
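- int, str, and tuple are immutable: changing the value creates a new object
  - a = 256 and b = 256 have the same id. CPython caches the small integers from -5 to 256, so variables holding a value in that range share one id
  - a = 257 and b = 257 may have different ids (for example, when entered on separate lines in the interactive interpreter)
  - a = 257 and b = a have the same id
  - str1 = 'string'
    - str1.upper() returns a new string object
- When Python looks up a variable, function, or module name
  - 1. it first checks locals(), the dictionary of local variables
  - 2. if the name is not there, it checks globals(), the dictionary of global variables, and finally the built-ins
- Special methods of classes
    a, b = 1, 2
    # a + b is the same as a.__add__(b)
    a + b
    list1 = [1, 2]
    # list1[0] is the same as list1.__getitem__(0)
    list1[0]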
- Dictionary keys and values can be of various types
  - Note: a key cannot be a list or a set, because keys must be hashable
  - Invalid:
    lv = {['hello', 'nihao']: 1}
    lv = {{'hello', 'nihao'}: 1}
  - Valid:
    kv = {1: 'hello', 2: 'nihao'}
    kl = {1: ['hello', 'nihao']}
    kt = {1: ('hello', 'nihao')}
    ks = {1: {'hello', 'nihao'}}
    sv = {'hello': 1, 'nihao': 2}
    tv = {('hello', 'nihao'): 1}
- list, tuple, set to dict
    dict([(3, 9), (4, 16), (5, 25)])
    dict(([3, 9], [4, 16], [5, 25]))
    # sets are unordered, so which element becomes the key is not guaranteed
    dict(({3, 9}, {4, 16}, {5, 25}))
- dict to list, tuple, set keeps only the keys:
    tuple({1: 'a', 2: 'b'})
    list({1: 'a', 2: 'b'})
    set({1: 'a', 2: 'b'})
- To keep both key and value, store the keys in a list, then build the result in a loop
    D = {'a':1, 'c':3, 'b':2}
    D1 = list(D.keys())
    D1.sort()
    s = tuple()
    for i in D1:
        s = s + (i, D[i])
- Comprehensions
    (x for x in range(1,5))
    tuple(x * 2 for x in 'abc')
    [x for x in range(1,5) if x % 2 == 0]
    ['x' * 2 for x in 'abc']
    {x: x * x for x in range(1,5)}
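- join(): list, tuple, dict to str
tuple1 = ('hello', 'world')
' '.join(tuple1)
list1 = ['hello', 'world']
' '.join(list1)
# only the keys are kept
dict1 = {'hello': 1, 'world': 2}
' '.join(dict1)
- strip()
  - rstrip() strips only the right side
  - lstrip() strips only the left side
    # strip whitespace
    ' \t123\n '.strip()          # '123'
    '####123####'.strip('#')     # '123'
    # strips any of the characters #, 1, 3 from both ends
    '####123####'.strip('#13')   # '2'
- Comparing ints converted to strings
    # strings compare character by character, and '1' < '9'
    str(10) > str(9) # False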
- startswith() tests the start of a string
str1 = "my name is tz , age is 24"
x = str1.startswith('my')
print(x)
x = str1.startswith('name')
print(x)
# does 'name' start at index 3 (searching up to index 20)?
x = str1.startswith('name', 3, 20)
print(x)
Output
True
False
True
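- UTF-8 encoding of characters
  - Variable-length encoding: a character takes at least 8 bits and at most 32 bits
  - Since the length varies, how is the length of one character's encoding determined? The high bits of the first byte give the length
    - Once the length and continuation marker bits are excluded, the payload is 7, 11, 16, or 21 bits
First code point | Last code point | Byte 1 | Byte 2 | Byte 3 | Byte 4 |
---|---|---|---|---|---|
U+0000 | U+007F | 0xxxxxxx | | | |
U+0080 | U+07FF | 110xxxxx | 10xxxxxx | | |
U+0800 | U+FFFF | 1110xxxx | 10xxxxxx | 10xxxxxx | |
U+10000 | U+10FFFF | 11110xxx | 10xxxxxx | 10xxxxxx | 10xxxxxx |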
def string_to_bytes(s):
    array = bytearray(s, "utf8")
    list1 = [bin(i) for i in array]
    print(list1)
# a Chinese character takes 3 bytes
chinese = '一'
# an English letter takes 1 byte
english = 'a'
string_to_bytes(chinese)
string_to_bytes(english)
Output
['0b11100100', '0b10111000', '0b10000000']
['0b1100001']
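The four rows of the table can be exercised with one character from each range. A small sketch; `utf8_len` is a hypothetical helper name:

```python
def utf8_len(ch):
    # number of bytes the character occupies in UTF-8
    return len(ch.encode('utf-8'))


# one sample per row of the table: U+0061, U+00E9, U+4E00, U+1F600
samples = ['a', '\u00e9', '\u4e00', '\U0001F600']
lengths = [utf8_len(ch) for ch in samples]
print(lengths)  # [1, 2, 3, 4]

for ch in samples:
    print(f'U+{ord(ch):04X}', ch.encode('utf-8').hex())
```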
- ord() shows a character's code point
ord('1')  # 49
- sep: separator, end: trailing string
print(1, 2, 3, 4) # 1 2 3 4
print(1, 2, 3, 4, sep=',') # 1,2,3,4
print(1, 2, 3, 4, sep=',', end='!!') # 1,2,3,4!!
- end keeps the output on one line
for i in range(3):
    print(i)
for i in range(3):
    print(i, end='')
Output
0
1
2
012
- Type conversion: unpack with `*` instead of join()
list1 = [1, 2, 3, 4]
# join version
print(','.join(str(i) for i in list1)) # 1,2,3,4
# * version
print(*list1, sep=',') # 1,2,3,4
- Inserting variables
    s = 'name: {name} age: {age}'
    s.format(name='tz', age=24) # name: tz age: 24
  - vars()
    name = 'tz'
    age = 24
    s.format_map(vars()) # name: tz age: 24
  - format_map() with the attributes of a class instance
    class people:
        def __init__(self, name, age):
            self.name = name
            self.age = age
    p = people('tz', 24)
    s.format_map(vars(p)) # name: tz age: 24
name = 'tz'
age = '24'
str1 = 'name %s age %s' % (name, age)
str1 = 'name ' + name + ' ' + 'age ' + age
str1 = f'name {name} age {age}'
# no f prefix here: an f-string would evaluate {0} and {1} as the literals 0 and 1
str1 = 'name {0} age {1}'.format(name, age)
- format()
    # object attributes
    "my name is {a.name} , age is {a.age}".format(a=people('tz', 24))
    # keep two decimal places
    "{:.2f}".format(3.1415926)
- Use f'{v=}' instead of f"v = {v}"
    pi = 3.14
    print(f'{pi=}')
    # pi=3.14
- {} > + > % > format
Template() templates
from string import Template
str1 = Template("name $name age $age")
print(str1.substitute(name = 'tz', age = '24'))
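- Work on in-memory text and binary strings through the file interface
  - StringIO and BytesIO have no file descriptor
  - Useful in unit tests
- io.StringIO(): text strings
import io
s = io.StringIO()
# write
s.write('Hello World')
# write with print
print('Hello World', file=s)
# read everything
s.getvalue()
# rewind first: after writing, the position is at the end and read() would return ''
s.seek(0)
# read the first 4 characters
s.read(4)
- io.BytesIO(): binary bytes
  - print() cannot write to it
import io
s = io.BytesIO()
# write
s.write(b'Hello World')
# read everything
s.getvalue()
# rewind, then read the first 4 bytes
s.seek(0)
s.read(4)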
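As an illustration of the unit-testing use mentioned above, `contextlib.redirect_stdout` can capture what a function prints into a `StringIO` buffer. The `greet` function is a made-up example:

```python
import io
from contextlib import redirect_stdout


def greet(name):
    print(f'Hello {name}')


buffer = io.StringIO()
# everything printed inside the with block goes to the buffer
with redirect_stdout(buffer):
    greet('tz')

captured = buffer.getvalue()
assert captured == 'Hello tz\n'
```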
- Cartesian product:
list1 = ['A', 'B', 'C']
list2 = [1, 2, 3]
for i in list1:
    for j in list2:
        print((i, j))
Output
('A', 1)
('A', 2)
('A', 3)
('B', 1)
('B', 2)
('B', 3)
('C', 1)
('C', 2)
('C', 3)
- List comprehension
list3 = [(i, j) for i in list1 for j in list2]
print(list3) # [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 3), ('C', 1), ('C', 2), ('C', 3)]
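The standard library offers the same Cartesian product as `itertools.product`, which also scales to more than two inputs through `repeat`. A short sketch:

```python
from itertools import product

list1 = ['A', 'B', 'C']
list2 = [1, 2, 3]

pairs = list(product(list1, list2))
# same result, in the same order, as the nested comprehension
same = pairs == [(i, j) for i in list1 for j in list2]
print(same)  # True

# repeat=3 is the product of the input with itself three times
bits = list(product('01', repeat=3))
print(len(bits))  # 8
```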
- list.append(): accepts an object of any type
- Appending the list to itself (recursive)
# ['1', '2', '3', '4', '5']
list1 = list('12345')
list1.append(list1)
- Output:
list1 is list1[5]  # True
list1
['1', '2', '3', '4', '5', [...]]
- list.extend() merges lists
# ['1', '2', '3', '4', '5']
list1 = list('12345')
# extend with itself
list1.extend(list1)
- Output:
list1
['1', '2', '3', '4', '5', '1', '2', '3', '4', '5']
- Slicing sugar
# ['1', '2', '3', '4', '5']
list1 = list('12345')
# every second element
list1[::2]
['1', '3', '5']
# the last two elements
list1[-2::]
['4', '5']
- Unpacking
    a1, a2, a3 = [1, 2, 3]
    a1, a2, a3 = "123"
- Splitting into head, middle, tail
    a1, *a2, a3 = "123456789"
  - Output
    a1
    '1'
    a2
    ['2', '3', '4', '5', '6', '7', '8']
    a3
    '9'
- Splitting into head and rest
    list1 = [['name', 'tz', 'zt'], ['age', 24]]
    k, *v = list1
    # build a dictionary
    {k:v for k, *v in list1}
  - Output
    k
    ['name', 'tz', 'zt']
    v
    [['age', 24]]
    {k:v for k, *v in list1}
    {'name': ['tz', 'zt'], 'age': [24]}
- Pulling values out of a nested list
    list1 = [1, 2, [3, 4], 5]
    [a, b, [c, d], e] = list1
    # *_ skips the rest
    [a, b, [c, *_], *_] = list1
  - Output
    a
    1
    b
    2
    c
    3
- Removing duplicates while keeping the original list order
  - A plain set() does not keep the original list order
    list1 = [1, 2, 2, 8, 1, 5, 3, 5]
    # a set is unordered; small ints often print in ascending order, but that is not guaranteed
    set1 = set(list1) # {1, 2, 3, 5, 8}
  - Use a set() as a filter
    def dedupe(data):
        set1 = set()
        for i in data:
            if i not in set1:
                yield i
                set1.add(i)
    list1 = [1, 2, 2, 8, 1, 5, 3, 5]
    list(dedupe(list1)) # [1, 2, 8, 5, 3]
- Picking the int values out of a list
  - compress() returns an iterator over the items whose selector is True
    from itertools import compress
    list1 = [2, '8', 1, '9', '+', 3]
    # selectors: [True, False, True, False, False, True]
    list2 = [isinstance(i, int) for i in list1]
    # [2, 1, 3]
    list(compress(list1, list2))
- Picking ints and strings that parse as int
    def is_int(i):
        try:
            int(i)
            return True
        except ValueError:
            return False
    list1 = [2, '8', '-', 1, '9', '+', 3]
    # [2, '8', 1, '9', 3]
    list(filter(is_int, list1))
- list实现
class Queue(object):
def __init__(self, maxsize=0):
self.list = []
self.maxsize = maxsize
self.size = 0
def put(self, data):
if self.maxsize <= 0 or self.size < self.maxsize:
self.size += 1
self.list.append(data)
else:
raise ValueError("size is max")
def get(self):
self.size -= 1
return self.list.pop(0)
def isempty(self):
return len(self.list) == 0
q = Queue(maxsize=0)
q.put(1)
q.put(2)
print(q.get())
print(q.get())
print(q.isempty())
q = Queue(maxsize=1)
q.put(1)
try:
q.put(2)
except ValueError:
print('error')
- Implementing a queue with two stacks
class Queue:
    def __init__(self):
        self.stack1 = []
        self.stack2 = []
    def put(self, data):
        self.stack1.append(data)
    def get(self):
        # refill stack2 only when it is empty; otherwise FIFO order breaks
        if not self.stack2:
            while self.stack1:
                self.stack2.append(self.stack1.pop())
        return self.stack2.pop()
- Queue: FIFO
from queue import Queue
# maxsize 队列限制, 小于或等于0, 表示无限制
s = Queue(maxsize=0)
s.put("1")
s.put("2")
s.put("3")
# 队列长度
s.qsize()
s.get()
s.get()
s.get()
# 队列是否为空
s.empty()
# 队列是否满
s.full()
- LifoQueue: LIFO等同于stack
from queue import LifoQueue
s = LifoQueue()
s.put("1")
s.put("2")
s.put("3")
s.get()
s.get()
s.get()
- Bracket matching
from queue import LifoQueue
def matches(open, close):
    opens = "([{"
    closers = ")]}"
    return opens.index(open) == closers.index(close)
def parChecker(str1):
    stack1 = LifoQueue()
    for symbol in str1:
        # opening bracket
        if symbol in "([{":
            stack1.put(symbol)
        # closing bracket
        elif symbol in ")]}":
            # get() on an empty LifoQueue would block forever, so check first
            if stack1.empty():
                return False
            top = stack1.get()
            if not matches(top, symbol):
                return False
    # balanced only if no opening bracket is left over
    return stack1.empty()
print(parChecker("{{([][])}()}"))  # True
print(parChecker("[{()]"))  # False
class CircularQueue():
def __init__(self, maxsize):
self.maxsize = maxsize
self.queue = [None] * maxsize
# -1表示队列为空
self.head = self.tail = -1
def enqueue(self, data):
if (self.tail + 1) % self.maxsize == self.head:
print("The circular queue is full\n")
# 第一次添加元素
elif self.head == -1:
self.head = 0
self.tail = 0
self.queue[self.tail] = data
else:
self.tail = (self.tail + 1) % self.maxsize
self.queue[self.tail] = data
def dequeue(self):
if self.head == -1:
print("The circular queue is empty\n")
# 当队列只剩一个元素时
elif self.head == self.tail:
temp = self.queue[self.head]
self.head = self.tail = -1
return temp
else:
temp = self.queue[self.head]
self.head = (self.head + 1) % self.maxsize
return temp
def print(self):
if self.head == -1:
print("No element in the circular queue")
else:
for i in range(self.head, self.tail + 1):
print(self.queue[i], end=" ")
print()
q = CircularQueue(5)
# 加入队列
for i in range(5):
q.enqueue(i)
q.print()
# 出队列
q.dequeue()
q.print()
- Deque supports both FIFO and LIFO
- append() and pop() at either end are O(1)
  - but random access to elements in the middle is O(n)
- list implementation
class Deque:
def __init__(self):
self.items = []
def isEmpty(self):
return self.items == []
def appendright(self, item):
self.items.append(item)
def appendleft(self, item):
self.items.insert(0, item)
def popleft(self):
return self.items.pop(0)
def pop(self):
return self.items.pop()
d = Deque()
d.appendright(8)
d.appendright(5)
d.appendleft(7)
d.appendleft(10)
print(d.items)  # [10, 7, 8, 5]
# the class defines pop(), not popright()
d.pop()
d.popleft()
print(d.items)  # [7, 8]
- deque()
from collections import deque
de = deque([1, 2, 3])
de.append(4)
print(de)
# 左边添加
de.appendleft(0)
print(de)
de.pop()
# 左边移除
de.popleft()
print(de)
输出
deque([1, 2, 3, 4])
deque([0, 1, 2, 3, 4])
deque([1, 2, 3])
- maxlen: keeps a fixed length; new elements push out the oldest ones
de = deque(maxlen=3)
de.append(1)
de.append(2)
de.append(3)
print(de)
de.append(4)
print(de)
Output
deque([1, 2, 3], maxlen=3)
deque([2, 3, 4], maxlen=3)
- However, in a timeit measurement list came out 1.68 times faster than deque
- Contiguous memory
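from array import array
# 'i' is a signed int
array1 = array('i', range(3))
# 'd' is a double-precision float
array1 = array('d', [1.1, 2.2, 3.3])
- Keep the list sorted on every insert (O(log n) search plus O(n) element shifting) so that lookups stay O(log n)
import bisect
list1 = [1, 3, 4, 4, 4, 6, 7]
# find the insertion point to the right of equal items. O(log n)
print(bisect.bisect(list1, 4))  # 5
# leftmost insertion point
print(bisect.bisect_left(list1, 4))  # 2
# rightmost insertion point
print(bisect.bisect_right(list1, 4))  # 5
# insert while keeping the list sorted: O(log n) search, O(n) insertion
bisect.insort(list1, 5)
print(list1)  # [1, 3, 4, 4, 4, 5, 6, 7]
# lo=0, hi=4: the insertion point is searched only within list1[0:4],
# so 5 lands at index 4 and the list as a whole is no longer sorted
bisect.insort_right(list1, 5, 0, 4)
print(list1)  # [1, 3, 4, 4, 5, 4, 5, 6, 7]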
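A typical use of `bisect` is a table lookup over sorted breakpoints. The sketch below follows the grade example from the Python documentation; the breakpoints and grade letters are illustrative values.

```python
import bisect


def grade(score, breakpoints=(60, 70, 80, 90), grades='FDCBA'):
    # bisect returns how many breakpoints are <= score,
    # which is exactly the index of the matching grade
    return grades[bisect.bisect(breakpoints, score)]


results = [grade(s) for s in (33, 60, 77, 89, 90, 100)]
print(results)  # ['F', 'D', 'C', 'B', 'A', 'A']
```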
- heappush() and heappop() are O(log n)
import heapq
q = []
heapq.heappush(q, (1, "a"))
heapq.heappush(q, (3, "b"))
heapq.heappush(q, (2, "c"))
while q:
    print(heapq.heappop(q))
- heapify() turns a list into a heap in place in O(n); the smallest value is then list1[0]
list1 = [4, 2, 1, 6, 4, 7, 5]
heapq.heapify(list1)
print(list1)
print(list1[0])
print(heapq.heappop(list1))
Output
[1, 2, 4, 6, 4, 7, 5]
1
1
import heapq
class PriorityQueue:
    def __init__(self):
        self.queue = []
        self._index = 0
    def push(self, data, priority):
        # the smallest item is popped first, so store the negated priority;
        # _index breaks ties in insertion order
        heapq.heappush(self.queue, (-priority, self._index, data))
        self._index += 1
    def pop(self):
        # heappop returns a tuple such as (-3, 1, 'b'); [-1] picks the data
        return heapq.heappop(self.queue)[-1]
q = PriorityQueue()
q.push('a', 1)
q.push('b', 3)
q.push('c', 2)
q.push('d', 3)
# prints b d c a
while q.queue:
    print(q.pop())
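When only the few smallest or largest items are needed, `heapq.nsmallest` and `heapq.nlargest` avoid sorting the whole list. A short sketch; the `people` records are made-up data:

```python
import heapq

nums = [4, 2, 1, 6, 4, 7, 5]
smallest = heapq.nsmallest(3, nums)
largest = heapq.nlargest(2, nums)
print(smallest)  # [1, 2, 4]
print(largest)   # [7, 6]

# a key function selects the field to compare
people = [
    {'name': 'a', 'age': 30},
    {'name': 'b', 'age': 24},
    {'name': 'c', 'age': 41},
]
youngest = heapq.nsmallest(1, people, key=lambda p: p['age'])
print(youngest[0]['name'])  # b
```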
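- list: dynamic array
  - Usually holds items of the same type
  - Over-allocates memory so that later append() calls are cheap
  - Because it is mutable, it cannot be hashed
    list1 = [i for i in range(10)]
    set1 = set()
    # TypeError: unhashable type: 'list'
    set1.add(list1)
    # TypeError: a list can only be used as a value, not as a key
    dict1 = {list1: '1'}
  - Custom hash
    class List(list):
        # hash by id
        def __hash__(self):
            return hash(id(self))
    list1 = List(i for i in range(10))
    # now it can be added
    set1 = set()
    set1.add(list1)
    dict1 = {list1: '1'}
    list1 in set1 # True
    list1 in dict1 # True
- tuple: static array
  - Usually holds items of different types
# a one-element tuple needs a trailing comma
(1) # int
(1,) # tuple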
word = "hello Worrld ! in Python"
# 字符串转元组
tuple(word.split())
tuple(w[0] for w in word.split())
tuple(w[0] for w in word.split() if w[0].isupper())
tuple(w[0] for w in word.split() if w[0].islower())
def first(list):
return list[0]
def iscap(word):
return word[0].isupper()
def acronym(word, f):
return tuple(map(f, filter(iscap, word.split())))
def acronym1(word, f):
return tuple(f(w) for w in word.split() if iscap(w))
# 提取首字母为大写的单词
acronym(word, lambda x: x)
# 提取首字母为大写的字母
acronym(word, lambda x: x[0])
# 转换为小写
acronym(word, lambda x: x[0].lower())
def insert(s, x):
s = s + x
s = s + ' '
return s
# insert
def ninsert(n, y, x):
l, s = 0, ''
for i in n:
s = insert(s, i)
l = l + 1
if l == y:
s = insert(s, x)
return tuple(s.split())
n = ('hello', 'Worrld', '!', 'in', 'Python')
ninsert(n, 3, 'test')
- Tuples compare element by element, starting with the first; lists behave the same way
a = (1, 'a')
b = (2, 'b')
print(a < b)  # True
print(a > b)  # False
- namedtuple can stand in for a dictionary and takes less space
from collections import namedtuple
people = namedtuple('people', ('name', 'age'))
# or
people = namedtuple('people', 'name age')
a = people('tz', 24)
print(a.name, a.age)
print(a[0], a[1])
# show the fields
print (people._fields)
# as a dictionary
print(a._asdict())
Output
tz 24
tz 24
('name', 'age')
{'name': 'tz', 'age': 24}
_make() builds an instance from a list or tuple
from collections import namedtuple
people = namedtuple('people', ('name', 'age'))
# from a list
list1 = ['tz', 24]
print(people._make(list1))
# from a tuple
tuple1 = ('tz', 24)
print(people._make(tuple1))
Output
people(name='tz', age=24)
people(name='tz', age=24)
_replace() changes a value by creating a new instance
# AttributeError: can't set attribute
a.age = 20
# creates a new instance
a = a._replace(age=20)
- A function that takes a dictionary and applies its values to a namedtuple instance
from collections import namedtuple
def replace(dict1):
    return a._replace(**dict1)
people = namedtuple('people', ('name', 'age'))
a = people('tz', 24)
dict1 = {'name': 'tz', 'age': 21}
# result: people(name='tz', age=21)
a = replace(dict1)
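An alternative spelling of the same idea is `typing.NamedTuple`, which adds type annotations and default values while keeping `_replace()` and tuple unpacking. A sketch with a hypothetical `People` class:

```python
from typing import NamedTuple


class People(NamedTuple):
    name: str
    age: int = 0   # default value for an omitted field


a = People('tz')
b = a._replace(age=24)   # a itself stays unchanged
print(a)  # People(name='tz', age=0)
print(b)  # People(name='tz', age=24)

# it is still a plain tuple underneath
name, age = b
```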
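- Initialization
# the two forms are equivalent
dict1 = dict(a=1, b=2)
dict1 = {'a': 1, 'b': 2}
- fromkeys() builds a new dictionary from an iterable of keys, all mapped to the same value
dict1 = {'a': 1, 'b': 2}
# fromkeys is a classmethod: the contents of dict1 are not used
dict2 = dict1.fromkeys('b', 3)
print(dict2) # {'b': 3}
- get(), setdefault() read a key
dict1 = {'a': 1}
# returns 0 when the key is missing
dict1.get('b', 0)
# when the key is missing, insert it with this value
dict1.setdefault('b', 0)
dict1 = {}
dict1.setdefault('a', []).append(1)
dict1.setdefault('a', []).append(2)
print(dict1) # {'a': [1, 2]}
- defaultdict()
    from collections import defaultdict
    # missing keys start as an empty list
    dict1 = defaultdict(list)
    dict1['a'].append(1)
    dict1['a'].append(2)
    dict1 # defaultdict(<class 'list'>, {'a': [1, 2]})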
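Two common `defaultdict` patterns are grouping with `list` and counting with `int`. A short sketch over made-up words:

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# group words by their first letter
by_letter = defaultdict(list)
for w in words:
    by_letter[w[0]].append(w)

# count words by length; missing keys start at int() == 0
by_length = defaultdict(int)
for w in words:
    by_length[len(w)] += 1

print(dict(by_letter))
print(dict(by_length))
```

Note that merely reading a missing key, as in `by_letter['z']`, inserts it; use `.get()` or `in` when that side effect is unwanted.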
- Keys are unique
# a repeated key acts as an update
dict1 = {'x': 1,'x':2} # {'x': 2}
- Removing duplicates by key or value, using a set() as the filter
    def dedupe(data, func=None):
        set1 = set()
        for kv in data:
            val = kv if func is None else func(kv)
            if val not in set1:
                yield kv
                set1.add(val)
    list1 = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 2}, {'x': 1, 'y': 3}]
    # dedupe on the value extracted by func; seen values are stored in the set
    list(dedupe(list1, func=lambda d: d['x']))
    list(dedupe(list1, func=lambda d: (d['x'], d['y'])))
  Output
    [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
    [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 3}]
- Removing duplicate lines from a file
    with open('/tmp/test', 'r') as file:
        for line in dedupe(file):
            print(line.strip('\n'))
- Removing duplicate values by swapping keys and values
test_dict = {'a': 10, 'b': 15, 'c': 20, 'd': 10, 'e': 20}
temp = {val: key for key, val in test_dict.items()}
res = {val: key for key, val in temp.items()}
print(res)
Output
{'d': 10, 'b': 15, 'e': 20}
dict(zip('abc', range(3))) # {'a': 0, 'b': 1, 'c': 2}
-
对value进行排序
dict1 = { 'o': 3, 'y': 2, 'x': 1, 'z': 3 } # 交换kv dict1_zip = zip(dict1.values(), dict1.keys()) print(sorted(dict1_zip)) # 相当于循环 print
输出
[(1, 'x'), (2, 'y'), (3, 'o'), (3, 'z')]
-
反转key, value
dict1_zip = zip(dict1.values(), dict1.keys()) dict1_reverse = {key: value for key, value in dict1_zip} print(dict1_reverse) # {3: 'z', 2: 'y', 1: 'x'}
- Build a dict from two lists
k = ['k1', 'k2']
v = ['v1', 'v2']
kv = dict(zip(k, v))
# Take every other element from two lists with slices and pair the results into a dict
old_kv = ['k1', 'v1', 'k2', 'v2']
old_kv1 = ['k10', 'v10', 'k20', 'v20']
new_kv = dict(zip(old_kv[0::2], old_kv1[0::2]))
- Output
kv
{'k1': 'v1', 'k2': 'v2'}
new_kv
{'k1': 'k10', 'k2': 'k20'}
dict1 = {'k1': 'v1', 'k2': 'v2'}
(k1, v1), (k2, v2) = dict1.items()
(i1, i2) = dict1.items()
# *_ collects and discards the remaining items
(k1, v1), *_ = dict1.items()
# _ discards a single item
(i1, _) = dict1.items()
- Output
k1
'k1'
v1
'v1'
i1
('k1', 'v1')
i2
('k2', 'v2')
- keys() supports set operations and values() does not; items() supports them too, as long as the values are hashable
dict1 = {'x': 1, 'y': 2, 'z': 3}
dict2 = {'x': 1, 'y': 20, 'o': 3}
# & intersection
dict1.keys() & dict2.keys() # {'x', 'y'}
dict1.items() & dict2.items() # {('x', 1)}
# | union
dict1.keys() | dict2.keys() # {'o', 'x', 'y', 'z'}
# items are compared as (key, value) tuples
dict1.items() | dict2.items() # {('o', 3), ('x', 1), ('y', 2), ('y', 20), ('z', 3)}
# - difference
dict1.keys() - dict2.keys() # {'z'}
dict2.keys() - dict1.keys() # {'o'}
# ^ symmetric difference: elements found in exactly one of the two
dict1.keys() ^ dict2.keys() # {'o', 'z'}
dict1.items() ^ dict2.items() # {('o', 3), ('y', 2), ('y', 20), ('z', 3)}
- a | b instead of {**a, **b} (Python 3.9+)
{**a, **b}
# or
a | b
- |= instead of dict.update()
dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'd': 4}
# Each of the three statements below gives the same result: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict1.update(dict2)
dict1 = {**dict1, **dict2}
dict1 |= dict2
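The difference between `|` and `|=` is easy to miss, so here is a minimal sketch of it. The names `defaults`, `overrides`, and `alias` are made up for illustration, and the operators assume Python 3.9 or newer.

```python
# Requires Python 3.9+ for the | and |= dict operators.
defaults = {'host': 'localhost', 'port': 8080}
overrides = {'port': 9090, 'debug': True}

# | builds a new dict; for duplicate keys the right-hand operand wins.
merged = defaults | overrides
print(merged)    # {'host': 'localhost', 'port': 9090, 'debug': True}
print(defaults)  # {'host': 'localhost', 'port': 8080}

# |= updates the left-hand dict in place, like dict.update().
alias = defaults
defaults |= overrides
print(alias is defaults)  # True
print(alias['port'])      # 9090
```

Use `|` when the inputs must stay unchanged, and `|=` when mutating the existing dict is acceptable.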
from collections import Counter
d = Counter()
# A missing key returns 0
d['a'] # 0
d['a'] += 1 # Counter({'a': 1})
d['a'] += 1 # Counter({'a': 2})
d['b'] = 10 # Counter({'a': 2, 'b': 10})
# update() adds to the existing counts instead of replacing them
d.update({'a': 10, 'b': 10, 'c': 10}) # Counter({'a': 12, 'b': 20, 'c': 10})
# Returns a list of (key, count) pairs, most common first
d.most_common() # [('b', 20), ('a', 12), ('c', 10)]
# elements() returns an iterator
list(d.elements()) # ['a', 'a', ......., 'c']
- Set-style operations
d1 = Counter('abc')
d2 = Counter('abd')
d1 + d2 # Counter({'a': 2, 'b': 2, 'c': 1, 'd': 1})
d1 - d2 # Counter({'c': 1})
d1 & d2 # Counter({'a': 1, 'b': 1})
d1 | d2 # Counter({'a': 1, 'b': 1, 'c': 1, 'd': 1})
- Count repeated values in a list
from collections import Counter
print(Counter(['B', 'B', 'A', 'B', 'C', 'A', 'B', 'B', 'A', 'C']))
Output
Counter({'B': 5, 'A': 3, 'C': 2})
- Count how often each command was run
from collections import Counter
import pprint
cmd = []
# Collect the command name from every history line that has at least one argument
with open('/home/tz/.bash_history', 'r') as f:
    for line in f:
        l = line.split()
        if len(l) > 1:
            cmd.append(l[0])
# Counter tallies the repeated values; its repr lists them from most to least common
pprint.pprint(Counter(cmd))
- Use a lambda to make the default value 1
from collections import defaultdict
dict1 = defaultdict(lambda: 1)
dict1['a'] # 1
dict1['b'] # 1
- Use list as the factory so that one key can hold several values
from collections import defaultdict
# Every value is a list
d = defaultdict(list)
for i in range(5):
    d[i].append(i)
print(d) # defaultdict(<class 'list'>, {0: [0], 1: [1], 2: [2], 3: [3], 4: [4]})
- Use int as the factory to count
from collections import defaultdict
# Every value starts at int(), which is 0
d = defaultdict(int)
list1 = [1, 2, 3, 4, 2, 4, 1, 2]
for i in list1:
    d[i] += 1
print(d) # defaultdict(<class 'int'>, {1: 2, 2: 3, 3: 1, 4: 2})
- Count how often each command was run
from collections import defaultdict
import pprint
# Collect the command name from every history line that has at least one argument
cmd = []
with open('/home/tz/.bash_history', 'r') as f:
    for line in f:
        l = line.split()
        if len(l) > 1:
            cmd.append(l[0])
# Every value starts at 0
d = defaultdict(int)
for i in cmd:
    d[i] += 1
# Unlike Counter, the result is not ordered by count
pprint.pprint(d)
from collections import OrderedDict
d = OrderedDict()
d['a'] = 1
d['b'] = 2
d['c'] = 3
for key in d:
    print(key, d[key])
# Move b to the end
d.move_to_end('b')
# Move b to the front
d.move_to_end('b', last=False)
from collections import ChainMap
dict1 = {'x': 0, 'y': 1}
dict2 = {'x': 2, 'z': 3}
c = ChainMap(dict1, dict2)
# When a key appears in several maps, the first map wins
c['x'] # 0
list(c.keys()) # ['x', 'z', 'y']
list(c.values()) # [0, 3, 1]
- new_child(): push a new map in front of the existing ones
c = ChainMap()
c['x'] = 0
c['y'] = 1
c = c.new_child()
c['x'] = 2
c['z'] = 3
# Output: ChainMap({'x': 2, 'z': 3}, {'x': 0, 'y': 1})
print(c)
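A common use of `new_child()` is nested scopes. The sketch below uses hypothetical names (`global_scope`, `local`) to show that lookups fall through to older maps while writes only touch the newest one.

```python
from collections import ChainMap

global_scope = {'x': 0, 'y': 1}
scopes = ChainMap(global_scope)

# new_child() pushes a fresh, empty dict in front of the existing maps.
local = scopes.new_child()
local['x'] = 2  # writes only touch the first (newest) map

print(local['x'])          # 2, found in the local map
print(local['y'])          # 1, falls through to global_scope
print(global_scope)        # {'x': 0, 'y': 1}
print(local.parents['x'])  # 0, parents skips the first map
```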
from types import MappingProxyType
dict1 = {'a': 1}
dict1_proxy = MappingProxyType(dict1)
# Error: TypeError, the proxy is a read-only view
dict1_proxy['a'] = 2
- A set uses more memory than a list, but membership tests are faster
- Find the elements that appear two or more times
list1 = ['a', 'b', 'c', 'b', 'd', 'm', 'n', 'n']
set1 = set([x for x in list1 if list1.count(x) > 1])
print(set1)
Output
{'n', 'b'}
- intersection(): intersection
list1 = range(5)
set1 = set([1, 6])
print(set1.intersection(list1))
Output
{1}
- difference(): difference
list1 = range(5)
set1 = set([1, 6])
print(set1.difference(list1))
Output
{6}
- frozenset(): deduplicate
list1 = [1, 1, 2, 2, 3]
# or: for i in set(list1):
for i in frozenset(list1):
    print(i)
Output
1
2
3
tuple1 = ('a', 'b', 1, 2)
fset1 = frozenset(tuple1)
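# isdisjoint(): True if the two have no elements in common
tuple2 = (3, 4)
fset1.isdisjoint(tuple2) # True
# issuperset(): True if fset1 contains every element of the argument
tuple3 = (1, 2)
fset1.issuperset(tuple3) # True
# issubset(): True if every element of fset1 is in the argument
tuple4 = ('a', 'b', 1, 2, 3, 4)
fset1.issubset(tuple4) # True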
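Because a frozenset is immutable, it is also hashable, which a plain set is not. A small sketch follows; the `edge_weights` mapping is a made-up example.

```python
# A frozenset can be used as a dict key or as a member of another set.
edge_weights = {
    frozenset(('a', 'b')): 3,
    frozenset(('b', 'c')): 5,
}

# Element order does not matter for the lookup.
print(edge_weights[frozenset(('b', 'a'))])  # 3

# A mutable set is unhashable, so using it as a key raises TypeError.
try:
    {set(('a', 'b')): 3}
    set_key_error = None
except TypeError as err:
    set_key_error = err
print(type(set_key_error).__name__)  # TypeError
```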
- Reduce memory
- DAWG (directed acyclic word graph)
- Shares prefixes and suffixes
- Immutable
import dawg
str1 = 'hello world'
dawg1 = dawg.DAWG(str1)
- A trie only shares prefixes
- marisa_trie
- Immutable
import marisa_trie
trie = marisa_trie.Trie(str1)
- datrie
- Mutable
import datrie
set1 = set(str1)
datrie1 = datrie.BaseTrie(set1)
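- After saving the code to a file, run mypy for static type checking
mypy ./test.py
# python2 (the --py2 flag only exists in older mypy releases)
mypy --py2 ./test.py
- The errors in the code below are reported at the static type checking stage, not at runtime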
def add(x: int, y: int):
    return x + y
# Error: the arguments are not int
add('1', '2')
def test(n: int) -> int:
    return n
test(1)
# Error: the argument is not an int
test(1.1)
def test1(n: int) -> int:
    # Error: a float is assigned to n, which is declared as int, so the return value is not an int
    n = 1.1
    return n
test1(1)
from typing import List
def test(names: List[str]) -> None:
    print(names)
names = ["Alice", "Bob", "Charlie"]
ages = [10, 20, 30]
test(names)
# Error: the argument is not a list
test('123')
- Generate pyi stub files automatically
- This decouples the type information from the source
- 1. Source:
class o:
    def __init__(self, x):
        self.x = []
- 2. Generate the pyi file. It is saved under .pytype/pyi/
pytype test.py
- 3. The pyi file:
from typing import List
class o:
    x: List[nothing]
    def __init__(self, x) -> None: ...
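- When static type checking is worth it:
- 1. When shipping an SDK, library, or interface to other people
- 2. Large code bases
- 3. Unit tests
- PEP 484 distinguishes two stages
- 1. Static checking stage
- 2. Runtime stage
- TYPE_CHECKING: import a library only during the static checking stage
from typing import TYPE_CHECKING
# True only during static checking, always False at runtime
if TYPE_CHECKING:
    import requests
- [] defines parameterized types:
# A tuple type: the first item must be int, the second must be str
n = tuple[int, str]
def f() -> n:
    return (1, '1')
# Error: the second item is not a str
def f() -> n:
    return (1, 1)
# Error: one item too many
def f() -> n:
    return (1, '1', 1)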
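- Literal: a fixed set of allowed values
from typing import Literal
# Allow only 'r' and 'w'
mode = Literal['r', 'w']
def myopen(path: str, m: mode) -> None:
    pass
# No error
myopen('/tmp/test', 'r')
# Error: 'rb' is not one of the mode literals
myopen('/tmp/test', 'rb')
- Union: a set of allowed types, which may differ from one use to the next
from typing import Union
n = Union[int, str]
# No error
def f() -> n:
    return 1
# No error
def f() -> n:
    return '1'
- TypeVar: a set of allowed types (generic); every use within one call must resolve to the same type
from typing import TypeVar
n = TypeVar('n', int, str)
def f(x: n, y: n):
    pass
# No error
f(1, 1)
# No error
f('1', '1')
# Error: both arguments must have the same type
f('1', 1)
- Iterable: an iterable object (anything that can be looped over, not only an iterator)
from typing import TypeVar, Iterable
n = TypeVar('n', int, float)
# An iterable of 2-tuples
n1 = Iterable[tuple[n, n]]
def f(v: n1[n]):
    pass
# No error
f( ((1, 1), (2, 2)) )
# No error: int is accepted where float is expected
f( ((1, 1.1), (2.1, 2)) )
# Error: the items are not tuples
f( (1, 1.1) )
- Inheriting from Generic ties the methods of a class to a TypeVar
from typing import TypeVar, Generic
# Declare the type variable
n = TypeVar('n')
# Inherit from the generic base
class o(Generic[n]):
    def f(self, x: n):
        pass
# Bind the type variable to int
a: o[int] = o()
# No error
a.f(1)
# Error: the argument must be of type n, which is int here
a.f('1')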
- @no_type_check: turn off static type checking for a function
from typing import no_type_check
# No error is reported
@no_type_check
def f(x: int) -> int:
    return x + 1
f(1.1)
- Final: the variable cannot be reassigned
from typing import Final
x: Final = 1
# Error: a Final variable cannot be modified
x += 1
- @final: the class cannot be subclassed, the method cannot be overridden
- Class
from typing import final
@final
class o:
    pass
# Error: a final class cannot be subclassed
class o1(o):
    pass
- Method:
class o:
    @final
    def f(self):
        pass
class o1(o):
    # Error: a final method cannot be overridden
    def f(self):
        pass
- @dataclass: turn a class into a data structure
from dataclasses import dataclass
@dataclass
class o:
    x: int
    y: int
a = o(1, 2)
from dataclasses import dataclass
from typing import List
@dataclass
class o:
    x: int
    y: int
@dataclass
class o1:
    mylist: List[o]
b = o1([o(1, 2), o(3, 4)])
print(b.mylist)
# The x field of the first o object in the list
print(b.mylist[0].x)
- Protocol: static duck typing for any class. The checker looks at whether the required methods and attributes exist, not at the inheritance tree
from typing import Protocol, Iterable
class iresource(Protocol):
    def close(self) -> None:
        pass
# No need to inherit from iresource
class resource():
    def close(self) -> None:
        pass
# Close every object in the iterable
def close_all(r: Iterable[iresource]):
    for i in r:
        i.close()
f = open('/tmp/test')
r = resource()
# Any class with a close() method is accepted
close_all([f, r])
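Protocols are checked statically by default. If a runtime check is also wanted, `typing.runtime_checkable` enables `isinstance()` against the protocol. A minimal sketch, with the class names `SupportsClose`, `Resource`, and `NoClose` invented for illustration:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...

class Resource:
    # No inheritance from SupportsClose; having close() is enough.
    def close(self) -> None:
        pass

class NoClose:
    pass

print(isinstance(Resource(), SupportsClose))  # True
print(isinstance(NoClose(), SupportsClose))   # False
```

The runtime check only verifies that the method exists; it does not validate the signature.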
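- The key parameter of sorted() can normalize mixed data types
list1 = [10, '21', 9, '1', 33, 62, '4', 31]
# A direct call raises TypeError: int and str cannot be compared
print(sorted(list1))
# Pass a key
print(sorted(list1, key=int)) # ['1', '4', 9, 10, '21', 31, 33, 62]
print(sorted(list1, key=str)) # ['1', 10, '21', 31, 33, '4', 62, 9]
- Find the largest number that can be formed by concatenating the list items. Comparing the items as strings turns this into a sorting problem
- Plain approach (works for this input, but plain string order fails for inputs such as [3, 30])
list1 = [10, 9, 33, 62, 31]
list2 = [str(i) for i in list1]
print(''.join(sorted(list2, reverse=True))) # 962333110
- With cmp_to_key(), the key parameter of sorted() (which sets the ordering rule) can take a two-argument comparison
from functools import cmp_to_key
# It must be int(y+x)-int(x+y), not int(x+y)-int(y+x)
list2 = sorted(list2, key=cmp_to_key(lambda x, y: int(y+x)-int(x+y)))
print(''.join(list2))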
- Sort dicts by value
- With a lambda
users = [
    {'name': 'user0', 'age': 24, 'uid': 1000},
    {'name': 'user1', 'age': 22, 'uid': 1002},
    {'name': 'user2', 'age': 22, 'uid': 1001},
    {'name': 'user3', 'age': 21, 'uid': 1003}
]
# Sort by age
print(sorted(users, key=lambda x: x['age']))
# Sort by age, then by uid
print(sorted(users, key=lambda x: (x['age'], x['uid'])))
- itemgetter()
from operator import itemgetter
print(sorted(users, key=itemgetter('age')))
print(sorted(users, key=itemgetter('age', 'uid')))
- Group dicts by a value
- groupby()
users = [
    {'name': 'user0', 'age': 24, 'scores': 60},
    {'name': 'user1', 'age': 22, 'scores': 100},
    {'name': 'user2', 'age': 22, 'scores': 50},
    {'name': 'user3', 'age': 21, 'scores': 60}
]
from itertools import groupby
from operator import itemgetter
# groupby() only merges adjacent items, so sort first
users.sort(key=itemgetter('scores'))
for k, items in groupby(users, key=itemgetter('scores')):
    print(k)
    for i in items:
        print(i)
Output
50
{'name': 'user2', 'age': 22, 'scores': 50}
60
{'name': 'user0', 'age': 24, 'scores': 60}
{'name': 'user3', 'age': 21, 'scores': 60}
100
{'name': 'user1', 'age': 22, 'scores': 100}
- defaultdict()
from collections import defaultdict
dict1 = defaultdict(list)
for i in users:
    dict1[i['scores']].append(i)
for i in dict1[60]:
    print(i)
Output
{'name': 'user0', 'age': 24, 'scores': 60}
{'name': 'user3', 'age': 21, 'scores': 60}
- Sort objects by attribute
- With a lambda
class User:
    def __init__(self, id):
        self.uid = id
    def __repr__(self):
        return f'{self.uid}'
users = [User(3), User(2), User(1)]
sorted(users, key=lambda u: u.uid) # [1, 2, 3]
- attrgetter()
from operator import attrgetter
sorted(users, key=attrgetter('uid')) # [1, 2, 3]
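- Returning multiple values
def f():
    return 1, 2, 3
tuple1 = f()
a, b, c = f()
- lambda: anonymous function
# x is a free variable: it is looked up when the lambda runs, not when it is defined
x = 10
f = lambda y: x + y
# 30
f(20)
- Free variables in comprehensions
# Wrong: n is not bound per lambda, so every lambda sees the final value 4
funcs = [lambda x: x+n for n in range(5)]
# Bind n through a default argument
funcs = [lambda x, n=n: x+n for n in range(5)]
for f in funcs:
    print(f(0))
- Callback functions
- Three variants: function, class, coroutine
def closure():
    n = 0
    # The closure function
    def func():
        print('n=', n)
    # nonlocal lets the inner function rebind a variable of the enclosing function
    def set_n(value):
        nonlocal n
        n = value
    func.set_n = set_n
    return func
# f is the inner func
f = closure()
f() # n= 0
f.set_n(1)
f() # n= 1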
- *argv: collects the remaining positional arguments as a tuple
def myFun(arg1, arg2, *argv):
    print("First argument :", arg1)
    print("Second argument :", arg2)
    for arg in argv:
        print("Next argument through *argv :", arg)
myFun('Hello', 'Welcome', 'to', 'python')
# Or unpack a tuple
tuple1 = ('Hello', 'Welcome', 'to', 'python')
myFun(*tuple1)
- **kwargs: collects the remaining keyword arguments as a dict
def myFun(arg1, arg2, **kwargs):
    print("First argument :", arg1)
    print("Second argument :", arg2)
    for key, value in kwargs.items():
        print("%s == %s" % (key, value))
myFun('hello', 'tz', name='tz', age='24')
# Or unpack a dict
dict1 = {'name': 'tz', 'age': '24'}
myFun('hello', 'tz', **dict1)
# A regular function and an anonymous function
def a(x, y):
    """This is the module docstring."""
    c = {'a': 1, 'b': 2}
    return x + y
b = lambda x, y: x + y
# dir() lists the attributes and methods
dir(a)
dir(b)
# __doc__ returns the docstring: 'This is the module docstring.'
a.__doc__
# A call goes through the built-in __call__() method
a.__call__(1, 2)
b.__call__(1, 2)
a(1, 2)
b(1, 2)
# __dict__ is the dict that stores attributes set on the function object
def test():
    test.a = 1
# Before the function runs nothing has been assigned, so the dict is empty
test.__dict__
# After the function runs the attribute is visible
test()
test.__dict__
- inspect.signature(): get the formal parameters
from inspect import signature
def f(x, y, z=1):
    pass
args = signature(f)
print(args) # (x, y, z=1)
# An ordered mapping from parameter names to Parameter objects
print(args.parameters)
print(args.parameters['z'].default) # 1
- Decorators: functions that take a function and return a function
def wrapper(func):
    def newfunc(*args, **kw):
        print("I really am a decorator")
        return func
    return newfunc
@wrapper
def func():
    print("I am the original function")
# The @ line is shorthand for the following assignment
func = wrapper(func)
# test
# This runs newfunc
func()
# newfunc returns the original func, so the second () runs func
func()()
# Add one more layer
def wrapper1(func):
    def newfunc(*args, **kw):
        print("I really am the second decorator")
        return func
    return newfunc
def wrapper(func):
    def newfunc(*args, **kw):
        print("I really am a decorator")
        return func
    return newfunc
@wrapper1
@wrapper
def func():
    print("I am the original function")
# The stacked decorators are shorthand for
func = wrapper1(wrapper(func))
# test
func()()()
- functools.wraps(): copy the metadata (name, doc) of the wrapped function onto the wrapper
import time
from functools import wraps
def timethis(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(func.__name__, end-start)
        return result
    return wrapper
@timethis
def countdown(n):
    while n > 0:
        n -= 1
countdown(100000)
# __wrapped__ gives back the original function
origin_func = countdown.__wrapped__
origin_func(10)
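To see what `wraps()` actually changes, the sketch below compares two otherwise identical decorators. The names `plain`, `preserving`, `add`, and `mul` are illustrative only.

```python
from functools import wraps

def plain(func):
    # No @wraps: the wrapper keeps its own name and docstring.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def preserving(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def add(x, y):
    """Add two numbers."""
    return x + y

@preserving
def mul(x, y):
    """Multiply two numbers."""
    return x * y

print(add.__name__, add.__doc__)  # wrapper None
print(mul.__name__, mul.__doc__)  # mul Multiply two numbers.
print(hasattr(add, '__wrapped__'), hasattr(mul, '__wrapped__'))  # False True
```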
- Decorating a class
def wrapper(cls):
    class newcls:
        def __init__(self):
            self.name = "I really am a decorator"
            self.cls = cls
    return newcls
@wrapper
class cls:
    def __init__(self):
        self.name = "I am the original class"
# test
a = cls()
a.name
b = a.cls()
b.name
- Decorating methods of a class
from functools import wraps
def wrapper(func):
    @wraps(func)
    def f(*args, **kwargs):
        print('I am the decorator')
        return
    return f
class O:
    # @classmethod / @staticmethod must be the outermost decorator
    @classmethod
    @wrapper
    def f(cls):
        print('in classmethod')
    @staticmethod
    @wrapper
    def f1():
        print('in staticmethod')
# test classmethod
O.f() # I am the decorator
s = O()
# test staticmethod
s.f1() # I am the decorator
- Act differently depending on whether the function has a given parameter
from functools import wraps
import inspect
def optional_add(func):
    # Runs once, at decoration time: report functions that lack an arg_y parameter
    if 'arg_y' not in inspect.getfullargspec(func).args:
        print('not arg_y')
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper
# Prints: not arg_y
@optional_add
def f(x):
    pass
@optional_add
def f1(x, arg_y):
    pass
f1(1, 2)
f(1)
- Decorator defined as an instance method (the class must be instantiated first)
from functools import wraps
class O:
    # Decorator as an instance method
    def wrapper(self, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            print('I am the decorator')
            return func(*args, **kwargs)
        return wrapper
# Instantiate
o = O()
@o.wrapper
def f():
    pass
f() # I am the decorator
- Decorator defined as a class (no manual instantiation needed)
import types
from functools import wraps
class Wrapper:
    def __init__(self, func):
        wraps(func)(self)
        self.ncalls = 0
    def __call__(self, *args, **kwargs):
        self.ncalls += 1
        print('I am the decorator', self.ncalls)
        return self.__wrapped__(*args, **kwargs)
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return types.MethodType(self, instance)
    def f1(self):
        print('I am the f1 function')
@Wrapper
def f():
    pass
f() # I am the decorator 1
f() # I am the decorator 2
f.f1() # I am the f1 function
- partial(): pre-fill an argument
from functools import partial
def p(a, b):
    print(a, b)
p1 = partial(p, b=3)
p1(2) # 2 3
- cmp_to_key()
# Turn a comparison function into a key, then sort with it:
from functools import cmp_to_key
def cmp_fun(a, b):
    if a[-1] > b[-1]:
        return 1
    elif a[-1] < b[-1]:
        return -1
    else:
        return 0
list1 = ['9', '2', '7']
l = sorted(list1, key=cmp_to_key(cmp_fun))
print('sorted list :', l)
- lru_cache(): cache the results of recent calls, trading memory for fewer repeated computations
from functools import lru_cache
@lru_cache(maxsize=None)
def fib(n):
    if n == 0:
        return n
    return n + fib(n-1)
print([fib(n) for n in range(7)]) # [0, 1, 3, 6, 10, 15, 21]
print(fib.cache_info()) # CacheInfo(hits=6, misses=7, maxsize=None, currsize=7)
# Clear the cache
fib.cache_clear()
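With `maxsize=None` the cache grows without bound. The sketch below uses a small `maxsize` to show the least-recently-used eviction; the `square` function and the `calls` list are made up to make the misses visible.

```python
from functools import lru_cache

calls = []  # records every real (uncached) invocation

@lru_cache(maxsize=2)
def square(n):
    calls.append(n)
    return n * n

square(1)  # miss
square(2)  # miss, the cache now holds 1 and 2
square(1)  # hit, 1 becomes the most recently used entry
square(3)  # miss, evicts 2 (the least recently used entry)
square(2)  # miss again, evicts 1

print(calls)                # [1, 2, 3, 2]
print(square.cache_info())  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```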
- singledispatch(): pick a different implementation based on the type of the first argument
from functools import singledispatch
@singledispatch
def fun(s):
    print(s)
@fun.register(int)
def _(s):
    print(s * 2)
@fun.register(float)
def _(s):
    print(s - 1)
fun('GeeksforGeeks')
fun(10)
fun(10.0)
Output
GeeksforGeeks
20
9.0
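Since Python 3.7, `register` can also read the type from the annotation of the first parameter. A brief sketch with a hypothetical `describe` function, which also shows that dispatch follows subclass relationships:

```python
from functools import singledispatch

@singledispatch
def describe(value):
    return f'object: {value!r}'

@describe.register
def _(value: int):
    return f'int: {value}'

@describe.register
def _(value: list):
    return f'list of {len(value)} items'

print(describe('a'))     # object: 'a'
print(describe(10))      # int: 10
print(describe(True))    # int: True  (bool is a subclass of int)
print(describe([1, 2]))  # list of 2 items
```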
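- __ (name-mangled) methods are private and cannot be reached from a subclass; _ methods can
class A():
    def __f(self):
        print('in A')
class B(A):
    def __init__(self):
        super().__f()
# Error: AttributeError, inside B the name __f is mangled to _B__f
B()
- super(): call a parent-class method
- Base classes are searched from left to right through the __mro__ list, and each method is called only once
- The MRO list is built with the C3 linearization algorithm
- View the MRO list
int.__mro__
# (int, object)
- Calling parent methods without super()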
class Base:
    def __init__(self):
        print('Base.__init__')
class A(Base):
    def __init__(self):
        Base.__init__(self)
        print('A.__init__')
class B(Base):
    def __init__(self):
        Base.__init__(self)
        print('B.__init__')
class C(A, B):
    def __init__(self):
        A.__init__(self)
        B.__init__(self)
        print('C.__init__')
c = C()
Output: Base.__init__ is called twice
Base.__init__
A.__init__
Base.__init__
B.__init__
C.__init__
- Calling parent methods with super()
class Base:
    def __init__(self):
        print('Base.__init__')
class A(Base):
    def __init__(self):
        super().__init__()
        print('A.__init__')
class B(Base):
    def __init__(self):
        super().__init__()
        print('B.__init__')
class C(A, B):
    def __init__(self):
        super().__init__() # Only one call to super() here
        print('C.__init__')
c = C()
Output: Base.__init__ is called only once
Base.__init__
B.__init__
A.__init__
C.__init__
- View the MRO list
C.__mro__
# (<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class '__main__.Base'>, <class 'object'>)
- Depth-first: M5 -> M3 -> M1 -> M4 -> M1 -> M2
- Breadth-first: M5 -> M3 -> M4 -> M1 -> M2
- C3 (described in these notes as reverse breadth-first): M5 -> M4 -> M3 -> M2 -> M1
- Python uses the C3 algorithm
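The class hierarchy behind the M1..M5 orderings is not given in these notes, so the sketch below uses its own small hierarchy (all class names are made up) to show two properties of C3: a shared base appears once, after all of its subclasses, and contradictory base orders are rejected when the class is created.

```python
class O: pass
class A(O): pass
class B(O): pass
class C(A, B): pass

# The shared base O appears once, after both A and B.
print([cls.__name__ for cls in C.__mro__])  # ['C', 'A', 'B', 'O', 'object']

class X(A, B): pass  # X wants A before B
class Y(B, A): pass  # Y wants B before A

# No linearization can satisfy both orders, so class creation fails.
try:
    class Z(X, Y): pass
    mro_error = None
except TypeError as err:
    mro_error = err
print(type(mro_error).__name__)  # TypeError
```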
- If two classes have no inheritance relationship, multiple inheritance can tie them together
class A:
    def f(self):
        print(self[0])
# Combine class A with list
class B(A, list):
    pass
list1 = B()
list1.append(1)
list1.f()
class A:
    def __init__(self):
        self.a = 'A'
    def print(self):
        print(self.a)
class B:
    def __init__(self):
        self.b = 'B'
    def print(self):
        print(self.b)
class C(A, B):
    def __init__(self):
        # init A
        super().__init__()
        # init B
        B.__init__(self)
        # Call A.print()
        A.print(self)
        # Call B.print()
        B.print(self)
c = C()
c.print() # Resolves to A.print()
print(C.mro()) # Lookup order produced by the C3 algorithm
Output
A
B
A
[<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>]
from dataclasses import dataclass
@dataclass
class people:
    name: str
    age: int
# Roughly equivalent to the class below (dataclass also generates __repr__ and __eq__)
class people:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
example = people('tz', 24)
print(example)
- frozen: make instances read-only, defaults to False
@dataclass(frozen=True)
class people:
    name: str
    age: int
@dataclass(frozen=True)
class people:
    name: str
    age: int
    def __post_init__(self):
        # Error: FrozenInstanceError, fields of a frozen instance cannot be assigned
        self.name = self.name.upper()
example = people('tz', 24)
- field(default_factory=function): the default value of the field is the return value of the function
from dataclasses import dataclass, field
def f():
    return 24
@dataclass
class people:
    name: str
    age: int = field(default_factory=f)
example = people(name='tz')
print(example)
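The main reason `default_factory` exists is mutable defaults. A minimal sketch follows; the `Team` and `Broken` classes are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    # Each instance gets its own fresh list.
    members: list = field(default_factory=list)

a = Team('a')
b = Team('b')
a.members.append('tz')
print(a)  # Team(name='a', members=['tz'])
print(b)  # Team(name='b', members=[])

# A bare mutable default is rejected when the class is defined.
try:
    @dataclass
    class Broken:
        members: list = []
    default_error = None
except ValueError as err:
    default_error = err
print(type(default_error).__name__)  # ValueError
```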
- type() is Python's default metaclass
a = 1
print(a.__class__)
print(a.__class__.__class__)
Output
<class 'int'>
<class 'type'>
- type() can build a class dynamically, much like a decorator builds a function
- Three arguments
- 1. The class name
- 2. The base classes, as a tuple
- 3. The class dict: attributes and methods
# creating a base class
class Base:
    def myfun(self):
        print("This is inherited method!")
def f(self):
    print("This is Test class method!")
# Build the class
Test = type('Test', (Base, ), dict(x=1, my_method=f))
o = Test()
o.myfun()
o.my_method()
print(o.x)
- Using a metaclass with __new__ in place of a class decorator
- Decorator example
def debugmethods(cls):
    # vars() returns a dict mapping names to the methods and attributes of the class
    for key, val in vars(cls).items():
        # callable() is True for anything that can be called, such as functions and classes
        if callable(val):
            # setattr() sets an attribute or method on the class; a real debugging decorator would wrap val here
            setattr(cls, key, val)
    return cls
@debugmethods
class Calc:
    def add(self, x, y):
        return x+y
    def mul(self, x, y):
        return x*y
    def div(self, x, y):
        return x/y
mycal = Calc()
print(mycal.add(2, 3))
print(mycal.mul(5, 2))
- Metaclass example
def debugmethods(cls):
    for key, val in vars(cls).items():
        if callable(val):
            setattr(cls, key, val)
    return cls
# Metaclass
class Meta(type):
    def __new__(cls, clsname, bases, clsdict):
        obj = super().__new__(cls, clsname, bases, clsdict)
        # Decorate the class
        obj = debugmethods(obj)
        return obj
# Use the metaclass
class Base(metaclass=Meta):
    pass
# Inherit from Base
class Calc(Base):
    def add(self, x, y):
        return x+y
# Inherit from Calc
class Calc_adv(Calc):
    def mul(self, x, y):
        return x*y
mycal = Calc_adv()
print(mycal.add(2, 3))
print(mycal.mul(5, 2))
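- __new__(): creates the instance and returns it
- It is called before __init__(); override it to control how instances are created
- __init__(): initializes the instance, passing the arguments to the instance that was already created
What instantiating a class does internally
class cls(object):
    pass
a = cls.__new__(cls)
if isinstance(a, cls):
    cls.__init__(a)
# The lines above are equivalent to
a = cls()
- Use __new__ to cache instances, so that only one instance exists per name
- Drawback: __init__ still runs on every call
class O:
    # Cache in a dict
    cache = {}
    def __new__(cls, name):
        if name in cls.cache:
            return cls.cache[name]
        else:
            self = super().__new__(cls)
            cls.cache[name] = self
            return self
    def __init__(self, name):
        print('init')
        self.name = name
a = O('tz')
b = O('tz')
print(a is b) # True
- __del__(): destructor
Internal destructor that releases resources when the reference count of an instance drops to 0
Avoid defining __del__() where possible. Before Python 3.4 (PEP 442), a reference cycle like the one below could not be collected once __del__ was defined, which leaked memory
class A:
    def __init__(self, b):
        self.b = b
class B:
    def __init__(self):
        self.a = A(self)
    def __del__(self):
        print("die")
b = B()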
- __call__: call an instance like a function, instance()
class people(object):
    def __init__(self):
        self.name = 'tz'
    def __call__(self):
        print('class call')
# test
a = people()
a()
# callable() reports whether the object can be called; it returns True here
print(callable(a))
- __file__: show the path of a module
import re
re.__file__
- __dict__: the dict that stores the instance attributes set through self
class people(object):
    height = 180
    __weight = 100
    def __init__(self):
        self.name = 'tz'
    def age(self, n):
        self.age = n
a = people()
a.age(24)
a.__dict__
# a.name is equivalent to a.__dict__['name']
a.__dict__['name']
- __slots__: store instance attributes in fixed slots instead of a per-instance dict
A dict for every instance wastes a lot of memory; __slots__ is one way to reduce it
Note: code that relies on __dict__ will stop working
class people(object):
    __slots__ = ['name', 'age']
    def __init__(self):
        self.name = 'tz'
        self.age = 24
a = people()
print(a.__slots__)
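The practical consequences of `__slots__` can be checked directly. In this sketch the `Point` class is a made-up example: the instance has no `__dict__`, and assigning an attribute that is not listed in `__slots__` fails.

```python
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(hasattr(p, '__dict__'))  # False

# Attributes outside __slots__ cannot be added.
try:
    p.z = 3
    slot_error = None
except AttributeError as err:
    slot_error = err
print(type(slot_error).__name__)  # AttributeError
```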
- __str__() and __repr__()
- __str__: for end users. It is called only by print() or str()
class people(object):
    def __init__(self, name):
        self.name = name
    def __str__(self):
        return 'in __str__: my name is %s' % self.name
print(people('tz'))
# Evaluating the object in the interactive interpreter does not use __str__
people('tz')
Output
in __str__: my name is tz
<__main__.people object at 0x7f250de170a0>
- __repr__: for developers. It is also used for objects inside containers, e.g. print([obj]), and as the fallback when __str__ is not defined
class people(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'in __repr__: my name is %s' % self.name
print(people('tz'))
print([people('tz')])
# Evaluating the object in the interactive interpreter uses __repr__
people('tz')
Output
in __repr__: my name is tz
[in __repr__: my name is tz]
in __repr__: my name is tz
- @property: exposes a method as an attribute (property is a built-in class used as a decorator)
Read-only
class people(object):
    def __init__(self, name='tz'):
        self.name = name
    @property
    def age(self):
        return 24
# test
a = people()
a.age
# Error: AttributeError, the property has no setter
a.age = 23
Accessing a private variable
class people(object):
    __age = 24
    def __init__(self, name='tz'):
        self.name = name
    @property
    def age(self):
        return self.__age
# test
people().age
- @name.setter
Makes a @property attribute writable
class people(object):
    def __init__(self, name='tz'):
        self.name = name
        self._age = 24
    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, age):
        # Store the value under a different name; self.age = age here would recurse forever
        self._age = age
a = people()
a.age
a.age = 100
Modifying a private variable
class people(object):
    __age = 24
    def __init__(self, name='tz'):
        self.name = name
    @property
    def age(self):
        return self.__age
    @age.setter
    def age(self, age):
        # A bare __age = age would only create a local variable; this sets the class attribute, which all instances share
        people.__age = age
a = people()
a.age
a.age = 100
- @classmethod
The method can be called without creating an instance
# Accessing a private variable
class people(object):
    __height = 180
    def __init__(self, name='tz'):
        self.name = name
    @classmethod
    def height(cls):
        return cls.__height
people().height()
# Inheritance
class new_people(people):
    pass
new_people.height()
- @staticmethod
Behaves like a plain function placed in the class namespace: no self is passed, so it cannot touch instance state
class people(object):
    def __init__(self, name='tz'):
        self.name = name
    def inside(self):
        a = 1
        # An instance method can modify attributes
        self.name = 'inside_tz'
        return a
    # No self parameter
    @staticmethod
    def outside():
        a = 1
        return a
# test
a = people()
a.name
a.outside()
a.inside()
a.name
- getattr(): call a method by its name as a string
# Example 1
class A:
    def f(self, x):
        print(x)
a = A()
getattr(a, 'f')(0) # Calls a.f(0)
# Example 2
list1 = []
getattr(list1, 'append')(0) # Calls list1.append(0)
- operator.methodcaller(): another way to call a method by name
import operator
list1 = []
operator.methodcaller('append', 0)(list1)
- setattr()
class people:
    def __init__(self):
        self.name = 'tz'
# Add an age attribute to the class
setattr(people, 'age', 24)
# It can be read directly
people.age
- contextlib.contextmanager() decorator
- The code before yield acts as __enter__()
- The code after yield acts as __exit__()
from contextlib import contextmanager
@contextmanager
def list_transaction(list1):
    print('start', list1)
    yield list1
    print('end', list1)
list1 = []
with list_transaction(list1) as list_t:
    list_t.append(1)
    list_t.append(2)
- Wrapping a try statement
from contextlib import contextmanager
@contextmanager
def list_transaction(list1):
    print('start', list1)
    try:
        yield list1
    except AttributeError as err:
        print('error:', err)
    finally:
        print('end', list1)
list1 = []
with list_transaction(list1) as list_t:
    list_t.append(1)
    list_t.append(2)
    # Error: raises AttributeError, which the context manager catches
    list_t.ap
Output
start []
error: 'list' object has no attribute 'ap'
end [1, 2]
- __getitem__ method:
class o:
    def __getitem__(self, x):
        print(x)
class o1(o):
    pass
# () creates the instance, [] calls __getitem__ with the argument
o()['test']
o1()['test']
- fib example
class fib:
    def __getitem__(self, n):
        x, y = 1, 1
        for i in range(n):
            x, y = y, x + y
        return x
print(fib()[7])
- __class_getitem__ method:
class o:
    # cls is the class, item is the argument
    def __class_getitem__(cls, item):
        print(f"{cls.__name__}[{item.__name__}]")
class o1(o):
    pass
# [] on the class calls __class_getitem__ with the argument
o1[int]
- Note: this does not apply to attributes whose names start and end with __
class people:
    def __init__(self, name):
        self.name = name
o = people('tz')
o.name
# Error: AttributeError, there is no age attribute
o.age
- Using __getattribute__
class people:
    def __init__(self, name):
        self.name = name
    def __getattribute__(self, attr):
        # Return a fixed value for age
        if attr == 'age':
            return 24
        # Fall back to the default lookup for every other attribute
        return super().__getattribute__(attr)
o = people('tz')
o.name
o.age
- Decorator
def new_getattr(cls):
    orig_getattribute = cls.__getattribute__
    def new_getattribute(self, attr):
        print(attr)
        return attr
    cls.__getattribute__ = new_getattribute
    return cls
@new_getattr
class people:
    pass
o = people()
o.age # age
- Proxy:
- Plain approach
class A:
    def f(self):
        pass
    def f1(self, x):
        pass
class B:
    def __init__(self):
        self._a = A()
    def f(self):
        return self._a.f()
    def f1(self, x):
        return self._a.f1(x)
b = B()
b.f()
b.f1(1)
- Using __getattr__
class Proxy:
    def __init__(self, obj):
        self._obj = obj
    def __getattr__(self, name):
        return getattr(self._obj, name)
a = A()
b = Proxy(a)
b.f() # Calls Proxy.__getattr__('f')
b.f1(1)
- Proxying a list object
class ListLike:
    def __init__(self):
        # The wrapped list
        self._obj = []
    def __getattr__(self, name):
        # When an attribute or method is not found, look it up on the wrapped list
        return getattr(self._obj, name)
list1 = ListLike()
# append works like on a list
list1.append(1)
# Error: len() looks up __len__ on the type and bypasses __getattr__, so it has to be defined explicitly
len(list1)
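One way to make the proxy work with `len()` and indexing is to define the special methods explicitly and delegate them. This is a sketch of that idea, not the only option; the variable `items` is made up.

```python
class ListLike:
    def __init__(self):
        self._obj = []

    def __getattr__(self, name):
        # Regular attributes and methods are forwarded to the wrapped list.
        return getattr(self._obj, name)

    # Special methods are looked up on the type, so they are delegated by hand.
    def __len__(self):
        return len(self._obj)

    def __getitem__(self, index):
        return self._obj[index]

items = ListLike()
items.append(1)
items.append(2)
print(len(items))   # 2
print(items[0])     # 1
print(list(items))  # [1, 2], iteration falls back to __getitem__
```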
from functools import partialmethod
class Demo:
    def __init__(self):
        self.color = 'black'
    def _color(self, type):
        self.color = type
    set_red = partialmethod(_color, type='red')
    set_blue = partialmethod(_color, type='blue')
    set_green = partialmethod(_color, type='green')
obj = Demo()
print(obj.color)
obj.set_blue()
print(obj.color)
Output
black
blue
- False: by default the hash of an instance of a custom class is derived from id(), and two instances have different id() values
class Node(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
a = Node(1, 1)
b = Node(1, 1)
set1 = set()
set1.add(a)
# False
b in set1
- Define __hash__ and __eq__
class Node(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __hash__(self):
        return hash((self.x, self.y))
    def __eq__(self, o):
        return self.x == o.x and self.y == o.y
a = Node(1, 1)
b = Node(1, 1)
set1 = set()
set1.add(a)
# True
b in set1
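For simple value objects, a frozen dataclass can generate `__eq__` and `__hash__` instead of writing them by hand. This is offered as an alternative sketch, not as the approach used in the notes above.

```python
from dataclasses import dataclass

# frozen=True together with the default eq=True makes dataclass generate __hash__.
@dataclass(frozen=True)
class Node:
    x: int
    y: int

a = Node(1, 1)
b = Node(1, 1)
print(a == b)              # True
print(hash(a) == hash(b))  # True
print(b in {a})            # True
```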
- Implement a descriptor with setattr(), turning keyword arguments into attributes
class Descriptor:
    def __init__(self, name=None, **opts):
        self.name = name
        for key, value in opts.items():
            setattr(self, key, value)
    def __set__(self, instance, value):
        instance.__dict__[self.name] = value
o = Descriptor('test', a=1, b=2)
o.a = 10
o.b = 20
- Implement a type checker
class Typed(Descriptor):
    # Expected type
    expected_type = type(None)
    def __set__(self, instance, value):
        # Check the type
        if not isinstance(value, self.expected_type):
            raise TypeError('expected ' + str(self.expected_type))
        super().__set__(instance, value)
class Int(Typed):
    expected_type = int
- Implement a value check
class Unsigned(Descriptor):
    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Expected >= 0')
        super().__set__(instance, value)
# Descriptor
class Descriptor:
    def __init__(self, name=None, **opts):
        self.name = name
        for key, value in opts.items():
            setattr(self, key, value)
    def __set__(self, instance, value):
        instance.__dict__[self.name] = value
# Type checking
class Typed(Descriptor):
    expected_type = type(None)
    def __set__(self, instance, value):
        # Check the type
        if not isinstance(value, self.expected_type):
            print(self.expected_type)
            raise TypeError('expected ' + str(self.expected_type))
        super().__set__(instance, value)
# Types
class Int(Typed):
    expected_type = int
class Float(Typed):
    expected_type = float
class Str(Typed):
    expected_type = str
class Stock:
    a = Str('a')
    b = Int('b')
    c = Float('c')
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
# Error: the second argument is not an int
Stock('tz', 's', 1.1)
# No error
Stock('tz', 0, 1.1)
- Decorator version
class Typed:
    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return instance.__dict__[self.name]
    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError('Expected ' + str(self.expected_type))
        instance.__dict__[self.name] = value
# Class decorator that applies it to selected attributes
def typeassert(**kwargs):
    def decorate(cls):
        for name, expected_type in kwargs.items():
            # Attach a Typed descriptor to the class
            setattr(cls, name, Typed(name, expected_type))
        return cls
    return decorate
# Example use
@typeassert(name=str, shares=int, price=float)
class Stock:
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
# Error: the second argument is not an int
Stock('tz', 's', 1.1)
# No error
Stock('tz', 0, 1.1)
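- tuple and int do not support weak references
- gc (garbage collection)
- An object is reclaimed once its reference count drops to 0
import weakref
import sys
class Data:
    def __del__(self):
        print(id(self), 'dead')
a = Data()
# Reference count (the getrefcount() call itself holds one temporary reference)
sys.getrefcount(a) # 2
# Create a weak reference
w = weakref.ref(a)
w() is a # True
# The reference count did not increase
sys.getrefcount(a) # 2
# After the original object is deleted, the weak reference is dead: w() returns None
del a # 139893964471840 dead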
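A typical use of weak references is a cache that does not keep its entries alive. The sketch below relies on `weakref.WeakValueDictionary`; the `Data` class and the `cache` name are illustrative. The immediate removal shown in the comments is CPython behavior, where reference counting frees the object right away; other interpreters may need a garbage collection pass first.

```python
import weakref

class Data:
    def __init__(self, name):
        self.name = name

# Values are held weakly: an entry disappears once nothing else references the value.
cache = weakref.WeakValueDictionary()

obj = Data('a')
cache['a'] = obj
print('a' in cache)  # True

del obj              # on CPython the refcount drops to 0 right away
print('a' in cache)  # False on CPython
```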
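Mode | Behavior |
---|---|
r | Read only (does not truncate the file) |
w | Write only (creates the file if missing, truncates it) |
x | Write only (fails if the file already exists) |
r+ | Read and write (does not truncate the file) |
w+ | Read and write (creates the file if missing, truncates it) |
rb+ | Read and write a binary file |
wb+ | Read and write a binary file (creates the file if missing, truncates it) |
xb | Write-only binary file (fails if the file already exists) |
a | Write only, appending to the end |
a+ | Read and write, appending to the end (creates the file if missing) |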
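- Writing a file
with open('/tmp/test', 'w') as file:
    data = "123 321 abc ABC"
    file.write(data)
# Write with print()
with open('/tmp/test', 'w') as file:
    print('123 321 abc ABC', file=file)
- Reading a file
# Specify the encoding
file = open('/tmp/test', 'r', encoding='utf-8')
# The file must exist
file = open('/tmp/test')
# A second read() returns '' unless you seek(0) first
print(file.read())
file.close()
# with closes the file automatically when the block ends
with open('/tmp/test') as file:
    data = file.read()
- Reading several files
# Read and write at the same time, capitalizing the first letter of each word
with open('/tmp/test', 'r') as intf, open('/tmp/test1', 'w') as outf:
    for line in intf:
        print(' '.join(word.capitalize() for word in line.split()), file=outf)
# Read in a for loop
file = ('test', 'test1')
for i in file:
    f = open(i)
    print(f.read())
    f.close()
- Guarding against read errors
# Create the file if it does not exist, to avoid an error (file is a path here)
if not os.path.exists(file):
    os.mknod(file)
# Guard against an empty file: safe_load() returns None for it, and |= with None raises TypeError
with open(file, 'r') as f:
    try:
        page_dict |= yaml.safe_load(f)
    except TypeError:
        pass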
- Reading and writing binary files
# Decode after reading, encode before writing
with open('somefile.bin', 'rb') as f:
    data = f.read(16)
    text = data.decode('utf-8')
with open('somefile.bin', 'wb') as f:
    text = 'Hello World'
    f.write(text.encode('utf-8'))
- Unlike read(), readinto() fills an existing buffer instead of allocating a new object and returning it
- This avoids a large number of memory allocations
- It returns the number of bytes actually read
- Binary I/O can read and write C-style structures directly, such as array
import array
nums_write = array.array('i', [1, 2, 3, 4])
with open('/tmp/test.bin', 'wb') as f:
    f.write(nums_write)
nums_read = array.array('i', [0, 0, 0, 0])
with open('/tmp/test.bin', 'rb') as f:
    f.readinto(nums_read)
print(nums_read)
- Read a file into a byte array
import os.path
def read_into_buffer(filename):
    # Size the buffer to the size of the file
    buf = bytearray(os.path.getsize(filename))
    with open(filename, 'rb') as f:
        f.readinto(buf)
    return buf
# /tmp/file contains: 1234567890
data = read_into_buffer('/tmp/file')
# Slice it like an array
print(data[0:5]) # bytearray(b'12345')
- Memory-map a file with the mmap module and slice it like an array
import os
import mmap
def memory_map(filename, access=mmap.ACCESS_WRITE):
    size = os.path.getsize(filename)
    fd = os.open(filename, os.O_RDWR)
    return mmap.mmap(fd, size, access=access)
# mmap.ACCESS_COPY: writes change the memory only and are not written back to the file
file = memory_map('/tmp/file', mmap.ACCESS_COPY)
# Change the first character; 57 is the ASCII code of '9'
file[0] = 57
print(file[0:5]) # b'92345'
- Serializing arrays
import numpy as np
import h5py
# Generate a large data set
arr = np.random.randn(1000)
# Write. The dataset is named array1
with h5py.File('/tmp/test.hdf5', 'w') as f:
    dset = f.create_dataset("array1", data=arr)
# Read (the dataset is only readable while the file is open)
with h5py.File('/tmp/test.hdf5', 'r') as f:
    data = f['array1']
    print(min(data))
    print(max(data))
    print(data[:15])
- Compressing the file
# The default gzip compression level is 4
with h5py.File('/tmp/test.hdf5', 'w') as f:
    dset = f.create_dataset('array1', data=arr, compression="gzip", compression_opts=9)
- Storing several arrays in one file
import numpy as np
import h5py
arr1 = np.random.randn(1000)
arr2 = np.random.randn(1000)
# Write both arrays
with h5py.File('/tmp/test.hdf5', 'w') as f:
    dset = f.create_dataset("array1", data=arr1)
    dset = f.create_dataset("array2", data=arr2)
# Read both arrays
with h5py.File('/tmp/test.hdf5', 'r') as f:
    arr1 = f['array1']
    arr2 = f['array2']
    # Find the positions where arr1 is greater than 0, then read the matching positions of arr2. [:] loads the data into memory
    data = arr2[arr1[:]>0]
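- Writing several strings
- If both strings are small, version 1 is better, because I/O system calls are inherently slow
- If both strings are large, version 2 is better, because it avoids building a large temporary result and copying large blocks of memory
# Version 1
f.write(str1 + str2)
# Version 2
f.write(str1)
f.write(str2)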
- Define a function that batches writes by size
def sample():
    yield 'Is'
    yield 'Chicago'
    yield 'Not'
    yield 'Chicago?'
def combine(source, maxsize):
    parts = []
    size = 0
    for part in source:
        parts.append(part)
        size += len(part)
        if size > maxsize:
            print('#')
            yield ''.join(parts)
            parts = []
            size = 0
    yield ''.join(parts)
with open('/tmp/file', 'w') as f:
    for part in combine(sample(), 32768):
        f.write(part)
- Binary reads
- functools.partial: builds a callable that reads a fixed number of bytes on each call
from functools import partial
# Chunk size in bytes
SIZE = 32
with open('/tmp/file', 'rb') as f:
    records = iter(partial(f.read, SIZE), b'')
    for r in records:
        print(r)
import gzip
# The data must be bytes
data = b'test'
# Write
with gzip.open('/tmp/file.gz', 'w') as f:
    f.write(data)
# Read
with gzip.open('/tmp/file.gz', 'r') as f:
    data = f.read()
- compresslevel: sets the compression level
with gzip.open('/tmp/file.gz', 'w', compresslevel=5) as f:
    f.write(data)
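Since the binary modes above only accept bytes, a short sketch of the text modes may help: with 'wt' and 'rt', gzip.open() handles encoding and decoding itself. The temporary path and the sample text are assumptions made for this illustration.

```python
import gzip
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.gz')  # hypothetical temp location

# 'wt' / 'rt' open the archive in text mode, so str is encoded and decoded for us
with gzip.open(path, 'wt', encoding='utf-8') as f:
    f.write('hello\nworld\n')

with gzip.open(path, 'rt', encoding='utf-8') as f:
    lines = f.read().splitlines()

print(lines)
```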
- Converting values: the methods with an s, loads() and dumps()
- Reading and writing files: the methods without an s, load() and dump()
import json
# loads() parses a JSON string into a Python object. Note: JSON requires double quotes inside, so wrap the string in single quotes
str_dict = '{"a": 1, "b": 2}'
json.loads(str_dict)
# dumps() serializes a dict to a JSON string; '' becomes ""
dict1 = {'a': 1, 'b': 2}
json.dumps(dict1)
json.dumps(dict1, indent=4, sort_keys=True)
# load() reads a JSON file
with open('test.json') as file:
    data = json.load(file)
# dump() writes a JSON file
with open('test.json', 'w') as file:
    json.dump(dict1, file)
# ensure_ascii=False (ASCII escaping is the default) keeps non-ASCII text such as Chinese readable
with open('test.json', 'w') as file:
    json.dump(dict1, file, ensure_ascii=False)
- Use the object_hook parameter to turn JSON dicts into objects
class JSONObject:
    def __init__(self, d):
        self.__dict__ = d
s = '{"name": "tz", "age": 24}'
data = json.loads(s, object_hook=JSONObject)
# Read attributes
data.name  # tz
data.age  # 24
- Use the object_pairs_hook parameter to build another type (JSON itself only has lists and dicts)
from collections import OrderedDict
s = '{"name": "tz", "age": 24}'
# Create an OrderedDict
data = json.loads(s, object_pairs_hook=OrderedDict)
print(data)  # OrderedDict([('name', 'tz'), ('age', 24)])
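The hooks above customize decoding. For the opposite direction, json.dumps() accepts a `default` callable that is invoked for objects it cannot serialize on its own. The sketch below is only an illustration: the `Point` class and the `to_jsonable` helper are hypothetical names, not something from these notes.

```python
import json
from datetime import date

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

def to_jsonable(obj):
    """Called by json.dumps() only for objects it cannot serialize itself."""
    if isinstance(obj, date):
        return obj.isoformat()
    if isinstance(obj, Point):
        return {'x': obj.x, 'y': obj.y}
    raise TypeError(f'{type(obj).__name__} is not JSON serializable')

encoded = json.dumps({'p': Point(1, 2), 'day': date(2021, 12, 23)}, default=to_jsonable)
decoded = json.loads(encoded)
print(encoded)
```

Raising TypeError for unknown types keeps the default behaviour for everything the helper does not handle.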
- pprint.pprint(): nicer display
import json
from pprint import pprint
data = {
    'completed_in': 0.074,
    'max_id': 264043230692245504,
    'max_id_str': '264043230692245504',
    'next_page': '?page=2&max_id=264043230692245504&q=python&rpp=5',
    'page': 1
}
print(json.dumps(data, indent=4))
# Or pretty-print the dict directly
pprint(data)
- No code-injection security issue
orjson: a better JSON library
- Supports datetime and numpy
import redis
import json
dict1 = {
'a': 1,
'b': 2
}
r = redis.StrictRedis()
# Write to redis
r.json().set('doc', '$', json.dumps(dict1))
# Read from redis
reply = json.loads(r.json().get('doc'))
Usage is similar to json
import yaml
# Read a JSON file
with open('test.json') as file:
    data = yaml.load(file, Loader=yaml.FullLoader)
# Write a file, keeping Unicode characters unescaped
with open('test.yaml', 'w') as file:
    yaml.dump(dict1, file, allow_unicode=True)
- yaml.load(), like pickle.loads(), is open to code injection when used with an unsafe loader
print(yaml.load('!!python/tuple [1, 2, 3]', Loader=yaml.UnsafeLoader))
- Fix: use yaml.safe_load() instead
yaml.safe_load(f)
from configparser import ConfigParser
cfg = ConfigParser()
cfg.read("/tmp/ini.ini")
# List all sections
cfg.sections()
# Read the string value of the library key in the installation section
cfg.get("installation", "library")
# Read the boolean value of the log_errors key in the debug section
cfg.getboolean("debug", "log_errors")
# Read the int value of the nworkers key in the server section
cfg.getint("server", "nworkers")
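The calls above assume that /tmp/ini.ini already exists. A self-contained variant can be sketched with read_string(), which parses the same format from a string. The INI content below is hypothetical, invented only to match the section and key names used above, and the `port` key is an assumption added to show `fallback`.

```python
from configparser import ConfigParser

# Hypothetical INI content matching the sections and keys read above
INI_TEXT = """
[installation]
library = /usr/local/lib

[debug]
log_errors = true

[server]
nworkers = 32
"""

cfg = ConfigParser()
cfg.read_string(INI_TEXT)

sections = cfg.sections()
library = cfg.get('installation', 'library')
log_errors = cfg.getboolean('debug', 'log_errors')
nworkers = cfg.getint('server', 'nworkers')
# fallback= supplies a default for a missing key instead of raising
port = cfg.getint('server', 'port', fallback=8080)

print(sections, library, log_errors, nworkers, port)
```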
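- Security
- Data and instructions are stored together without distinction, which creates a code-injection risk, so never call loads() (deserialize) on data from an unknown source
- Serializing objects
- pickle lets data and commands be passed between processes
- dumps() serializes, loads() deserializes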
import pickle, base64
list1 = [1, 2, 3]
# dumps() supports 6 protocol versions
print(pickle.dumps(list1, protocol=0))
print(pickle.dumps(list1, protocol=1))
print(pickle.dumps(list1, protocol=2))
print(pickle.dumps(list1, protocol=3))
print(pickle.dumps(list1, protocol=4))
print(pickle.dumps(list1, protocol=5))
# Encode with base64 (an encoding, not encryption)
print(base64.b64encode(pickle.dumps(list1, protocol=0)))
# loads() detects the protocol version automatically
print(pickle.loads(b'(lp0\nI1\naI2\naI3\na.'))
Output:
b'(lp0\nI1\naI2\naI3\na.'
b']q\x00(K\x01K\x02K\x03e.'
b'\x80\x02]q\x00(K\x01K\x02K\x03e.'
b'\x80\x03]q\x00(K\x01K\x02K\x03e.'
b'\x80\x04\x95\x0b\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01K\x02K\x03e.'
b'\x80\x05\x95\x0b\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01K\x02K\x03e.'
b'KGxwMApJMQphSTIKYUkzCmEu'
[1, 2, 3]
dump(), load()
import pickle
integers = [1, 2, 3, 4, 5]
# Write
with open('file', 'wb') as file:
    pickle.dump(integers, file)
# Read
with open('file', 'rb') as file:
    integers = pickle.load(file)
print(integers)
optimize(): optimizes the serialized object produced by dumps()
import pickle, pickletools
list1 = [1, 2, 3]
list1_pickle = pickle.dumps(list1, protocol=0)
print(list1_pickle)
# Optimize
list1_pickle = pickletools.optimize(list1_pickle)
print(list1_pickle)
Output
b'(lp0\nI1\naI2\naI3\na.'
b'(lI1\naI2\naI3\na.'
dis(): disassembles a serialized object
pickletools.dis(list1_pickle)
Output
0: ( MARK
1: l LIST (MARK at 0)
2: I INT 1
5: a APPEND
6: I INT 2
9: a APPEND
10: I INT 3
13: a APPEND
14: . STOP
highest protocol among opcodes = 0
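To make the earlier pickle security warning concrete, here is a deliberately harmless sketch of why loads() must not be fed untrusted data: an object can tell pickle which callable to invoke during deserialization through `__reduce__`. The `Payload` class is hypothetical, and `eval` on a fixed arithmetic string stands in for whatever an attacker would really run.

```python
import pickle

class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # A real attack would name something like os.system instead.
        return (eval, ('6 * 7',))

blob = pickle.dumps(Payload())

# loads() does not rebuild a Payload; it calls the stored callable
result = pickle.loads(blob)
print(result)
```

The bytes in `blob` contain no reference to `Payload` at all, only the callable and its arguments, which is why inspecting the sender's class definitions offers no protection.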
- shelve implements a persistent, dict-like store on top of pickle
import shelve
file = shelve.open('/tmp/data')
a = ['123', '321']
# Write
file['a'] = a
# Read
file['a']
# Close
file.close()
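A variant of the shelve example using a context manager may be easier to get right, since the shelf is closed automatically. The sketch also shows a common pitfall: with the default `writeback=False`, mutating a fetched value in place is not persisted. The temporary path is an assumption for the demo.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data')  # hypothetical temp location

# The context manager closes the shelf even if an exception is raised
with shelve.open(path) as db:
    db['a'] = ['123', '321']
    # An in-place change to a fetched value is not saved by default,
    # so modify a copy and assign it back to the key
    items = db['a']
    items.append('456')
    db['a'] = items

with shelve.open(path) as db:
    stored = db['a']
    keys = list(db.keys())

print(stored, keys)
```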
from pathlib import Path
p = Path('/home/tz/notes/python.md')
# Absolute path of the directory containing the current script
Path(__file__).parent.absolute()
# Current directory
Path().absolute()
# Parent directories
p.parent
p.parents[0]
p.parents[1]
p.parents[2]
p.cwd().is_dir()
# File name
p.name
# File name without the extension
p.stem
# Extension only
p.suffix
# All extensions
p.suffixes
# Returns a list split on '.': ['.tar', '.bz2']
Path('my.tar.bz2').suffixes
# File attributes
p.stat()
# touch
p = Path('/tmp/test')
p.touch()
# Write to the file, replacing any existing content
p.write_text('123\n123\n')
# Read the file
p.read_text()
# Or
with p.open() as f:
    for line in f:
        print(line)
# List the entries of the current directory, like the ls command
for i in Path('.').iterdir():
    print(i)
# Build a list of the files in the current directory
flist = [p for p in Path('.').iterdir() if p.is_file()]
# Returns PosixPath objects
p = Path('.')
[i for i in p.glob('*')]
# Tell files and directories apart
for i in p.glob('*'):
    if i.is_file():
        print(f'file: {i}')
    elif i.is_dir():
        print(f'dir: {i}')
# find .
for i in p.glob('**/*'):
    print(i)
- time
from datetime import datetime, timezone, timedelta
from pathlib import Path
file = Path('/tmp/test')
# Get the atime as a Unix timestamp
unixtime = file.stat().st_atime
# Convert to a specific time zone (UTC+8)
tz = timezone(timedelta(hours=+8))
time = datetime.fromtimestamp(unixtime, tz=tz)
# Format
time.strftime("%Y-%m-%d %H:%M:%S")  # 2021-12-23 11:16:37
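- join(): joins path components
import os
path = os.path.join('/', 'usr', 'lib', 'local')
Method | Description |
---|---|
os.fork() | Creates a child process |
os.setsid() | Creates a new session and makes the calling process its session leader and process group leader |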
import os
# Create a directory
os.mkdir(filepath)
# Create a file
os.mknod(filepath)
# Create a FIFO (named pipe) for IPC
os.mkfifo(filepath)
# Delete a file
os.unlink(filepath)
# Get the extension
os.path.splitext(file)[1]
# Collect every extension under the current directory (subdirectories included)
suffix = set()
for root, dirs, files in os.walk(".", topdown=False):
    for name in files:
        suffix.add(os.path.splitext(name)[1])
# Print files and directories (including those in subdirectories)
for root, dirs, files in os.walk(".", topdown=False):
    for name in files:
        print(os.path.join(root, name))
    for name in dirs:
        print(os.path.join(root, name))
# Rename every directory named BBB to AAA
for root, dirs, files in os.walk(".", topdown=False):
    for name in dirs:
        if name == 'BBB':
            src = os.path.join(root, name)
            dst = os.path.join(root, 'AAA')
            os.rename(src, dst)
- random.random(): returns a float in [0, 1)
from random import random
random()  # 0.20777445807227546
- randint(): returns an integer within the range, both ends included
from random import randint
randint(1, 10)  # 4
- choice(), sample(): pick elements at random
from random import choice, sample
list1 = [1, 2, 3, 4, 5]
str1 = '12345'
# choice() returns a single value
choice(list1)  # 3
choice(str1)  # '3'
# sample() returns a list and can return several values
sample(list1, 1)  # [2]
sample(list1, 2)  # [1, 3]
sample(str1, 1)  # ['2']
sample(str1, 2)  # ['1', '3']
- shuffle(): shuffles the elements of a list in place
from random import shuffle
shuffle(list1)  # returns None; list1 is now e.g. [2, 3, 1, 4, 5]
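The values shown in the comments above differ on every run. When reproducible results are needed, for tests for instance, one option is a private `random.Random` instance with a fixed seed. The sketch below assumes an arbitrary seed of 42; the variable names are illustrative only.

```python
import random

# Two private generators with the same seed produce the same sequence
rng_a = random.Random(42)
rng_b = random.Random(42)

seq_a = [rng_a.randint(1, 10) for _ in range(5)]
seq_b = [rng_b.randint(1, 10) for _ in range(5)]

# shuffle() works in place and returns None, so copy first to keep the original
deck = [1, 2, 3, 4, 5]
shuffled = deck[:]
rng_a.shuffle(shuffled)

print(seq_a == seq_b, shuffled)
```

Using a separate instance rather than random.seed() leaves the module-level generator untouched for the rest of the program.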
import logging
def main():
    # Set the level to DEBUG, i.e. output every level
    logging.basicConfig(filename="app.log", level=logging.DEBUG)
    # The 5 levels
    logging.critical('log')
    logging.error('log')
    logging.warning('log')
    logging.info('log')
    logging.debug('log')
if __name__ == "__main__":
    main()
Output
CRITICAL:root:log
ERROR:root:log
WARNING:root:log
INFO:root:log
DEBUG:root:log
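- Settings
import logging
# Note: basicConfig() only takes effect on its first call, so combine the options below into one call
# Only output messages at DEBUG level or above
logging.basicConfig(level=logging.DEBUG)
# Disable output at CRITICAL level and below
logging.disable(logging.CRITICAL)
# filemode='w' overwrites the file; the default 'a' appends
logging.basicConfig(filename='app.log', filemode='w')
# Default format
logging.basicConfig(format='%(levelname)s:%(name)s:%(message)s')
# Custom format
logging.basicConfig(format='%(name)s - %(levelname)s - %(message)s')
# Set the time format
logging.basicConfig(datefmt='%d-%b-%y %H:%M:%S')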
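basicConfig() configures the root logger, and repeated calls are ignored once the root logger has handlers (unless `force=True` is passed, available since Python 3.8). An alternative that avoids this limitation is to configure a named logger explicitly. The following is a minimal sketch; the logger name 'demo' and the in-memory stream are assumptions chosen so the output can be inspected.

```python
import io
import logging

stream = io.StringIO()                      # stands in for a file or the console
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(name)s - %(levelname)s - %(message)s'))

logger = logging.getLogger('demo')          # 'demo' is a hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False                    # keep the root logger out of this demo

logger.debug('hidden')                      # below INFO, so it is dropped
logger.warning('shown')

output = stream.getvalue()
print(output)
```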
- Configuration file
; /tmp/log.ini
[loggers]
keys=root
[handlers]
keys=defaultHandler
[formatters]
keys=defaultFormatter
[logger_root]
level=INFO
handlers=defaultHandler
qualname=root
[handler_defaultHandler]
class=FileHandler
formatter=defaultFormatter
args=('app.log', 'a')
[formatter_defaultFormatter]
format=%(levelname)s:%(name)s:%(message)s
- Loading the configuration file
import logging.config
logging.config.fileConfig('/tmp/log.ini')
- No tedious configuration required
Time | Directive |
---|---|
Year | %Y |
Month | %m |
Day | %d |
Hour | %H |
Minute | %M |
Second | %S |
import time
# Year, month, day, hour, minute, second
current_time = time.strftime("%Y%m%d%H%M%S", time.localtime(time.time()))
print(current_time)
- Measuring how long a function takes
from time import time, sleep
start = time()
sleep(1)
end = time()
print('%.2f s' % (end - start))
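The start/end pattern above can be wrapped in a decorator so that any function can be timed without editing its body. This is a sketch under assumptions: the decorator name `timed` and the `elapsed` attribute are invented for illustration, and time.perf_counter() is used instead of time.time() because it is a monotonic, high-resolution clock.

```python
import functools
import time

def timed(func):
    """Decorator that stores the duration of the last call on the wrapper."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if func raises
            wrapper.elapsed = time.perf_counter() - start
    wrapper.elapsed = None
    return wrapper

@timed
def work(n):
    return sum(range(n))

result = work(1000)
print(f'{work.elapsed:.6f} s')
```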
- Timer
import time
class Timer:
    # time.perf_counter() offers the highest-resolution clock
    def __init__(self, func=time.perf_counter):
        self.time = 0.0
        self._func = func
        self._start = None
    def start(self):
        if self._start is not None:
            raise RuntimeError('Already started')
        self._start = self._func()
    def stop(self):
        if self._start is None:
            raise RuntimeError('Not started')
        end = self._func()
        self.time += end - self._start
        self._start = None
    def reset(self):
        self.time = 0.0
    @property
    def running(self):
        return self._start is not None
    # Support for the with statement
    def __enter__(self):
        self.start()
        return self
    def __exit__(self, *args):
        self.stop()
def countdown(n):
    while n > 0:
        n -= 1
if __name__ == '__main__':
    t = Timer()
    # Without with
    t.start()
    countdown(100)
    t.stop()
    print(t.time)
    # With the with statement
    with t:
        countdown(100)
    print(t.time)
- Note: to test how a string starts, use str.startswith() rather than re
import re
a = '123abc 192.168.1.1 ABC\n1.1.1.1\nabc ABC\n999.999.999.999\n<meta name="user-login" content="ztoiax">'
# findall() returns a list. Match IP addresses
re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', a)
# match() matches from the start of the string. It returns a match object on success and None on failure
aa = re.match(r'\d{1,3}', a)
# group() returns the matched string
ip = aa.group()
# search() returns the first match anywhere in the string
aa = re.search(r'\d+', a)
ip = aa.group()
# group(1) extracts the first capture group
match = re.search(r'"user-login" content="(.*?)"', a)
name = match.group(1)
# sub() replaces
# Replace every digit with 0
aa = re.sub(r'\d', '0', a)
# split() splits
aa = re.split(r'[\n\b.]', a)
compile(): builds a reusable pattern object
pattern = re.compile(r'\d{1,3}')
# search() returns only the first match
pattern.search(a).group()
# findall() returns all matches
pattern.findall(a)
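The IP pattern used above checks only the shape of the text, so it also accepts strings such as 999.999.999.999. One way to tighten this, sketched below, is to let the regex find candidates and have the standard-library ipaddress module validate them. The helper name `valid_ipv4` is an assumption for this example.

```python
import ipaddress
import re

text = '123abc 192.168.1.1 ABC\n1.1.1.1\nabc ABC\n999.999.999.999\n'
pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

def valid_ipv4(candidates):
    """Keep only the strings that ipaddress accepts as IPv4 addresses."""
    result = []
    for item in candidates:
        try:
            ipaddress.IPv4Address(item)
        except ipaddress.AddressValueError:
            continue
        result.append(item)
    return result

candidates = pattern.findall(text)   # shape-only matches
ips = valid_ipv4(candidates)         # range-checked addresses
print(candidates, ips)
```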
- flags
# re.DOTALL: . also matches newlines
re.compile('.*', re.DOTALL)
# re.I: case-insensitive
re.compile('regex', re.I)
# re.VERBOSE: whitespace inside the pattern is ignored; use a triple-quoted string
re.compile(r'''\d{1,3}\.
\d{1,3}\.
\d{1,3}\.
\d{1,3}''', re.VERBOSE)
# Combining flags
re.compile('.*', re.DOTALL | re.I | re.VERBOSE)
- Match text from the clipboard and join the captured groups
import pyperclip, re
phoneRegex = re.compile(r'something')
text = str(pyperclip.paste())
for groups in phoneRegex.findall(text):
    phoneNum = '-'.join([groups[1], groups[3], groups[5]])
import fnmatch
import os
pattern = '*py*'
files = os.listdir('.')
print('Matches :', fnmatch.filter(files, pattern))
- Built-in library
import hashlib
name = 'tz'
# md5
token = hashlib.md5(name.encode(encoding='UTF-8')).hexdigest()
# sha256
token = hashlib.sha256(name.encode(encoding='UTF-8')).hexdigest()
# sha3_512
token = hashlib.sha3_512(name.encode(encoding='UTF-8')).hexdigest()
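The calls above hash a short string held in memory. For files, the hash object can be fed incrementally with update(), reusing the iter() plus functools.partial idiom from the binary-read example. This is a sketch: the function name `sha256_file`, the 64 KiB chunk size, and the temporary file are assumptions.

```python
import hashlib
import os
import tempfile
from functools import partial

def sha256_file(path, chunk_size=65536):
    """Hash a file incrementally so it never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # iter() keeps calling f.read(chunk_size) until it returns b''
        for chunk in iter(partial(f.read, chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a temporary file (hypothetical content)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'tz')
file_hash = sha256_file(tmp.name)
os.remove(tmp.name)
print(file_hash)
```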
- Compiling:
- file: helloworld.pyx
print('hello world')
- file: setup.py
from setuptools import setup
from Cython.Build import cythonize
import numpy
setup(
    ext_modules=cythonize("helloworld.pyx"),
    include_dirs=[numpy.get_include()]
)
- Files produced by the build: helloworld.c, helloworld.cpython-39-x86_64-linux-gnu.so
python setup.py build_ext --inplace
- Run
import helloworld
- Using cython: 1.8x faster
- Using cython with static types: 16x faster
- |> pipe
10 |> range |> list |> print
import this
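- A tool that automatically upgrades Python code to the syntax of newer versions. It can target different Python versions and offers a preview mode for viewing the changes.
- Understanding Linux cp Command and Implementing in Python (implements the cp command with the os library)
- The author built a TODO project with Django + simple HTML + Whisper + mixtral-8x7b-instruct + SQLite
- searxng: a Python metasearch engine. The project is an aggregating search engine built on Flask that can combine results from more than 70 search engines.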