Go Notes (7): Performance Optimization

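This post covers performance optimization in concurrent Go programs: object pools built on a buffered channel, and sync.Pool.
It also covers installing and using the profiling tools.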

Object pools

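Objects that are expensive to create (network connections, for example) are usually pooled so they are not created over and over.
A buffered channel is enough to implement an object pool. A pool always needs some synchronization (a channel locks internally), so account for that cost and use a benchmark to check whether pooling really improves performance.
In general, use a separate pool for each type of object.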

import (
	"errors"
	"fmt"
	"testing"
	"time"
)

type ReusableObj struct{} // an empty struct, used here only as an example

type ObjPool struct {
	bufChan chan *ReusableObj // buffers the reusable objects
}

func NewObjPool(numOfObj int) *ObjPool {
	objPool := ObjPool{}
	objPool.bufChan = make(chan *ReusableObj, numOfObj) // create the pool
	for i := 0; i < numOfObj; i++ {
		objPool.bufChan <- &ReusableObj{} // fill the pool (in practice: connections or other costly objects)
	}
	return &objPool
}
// GetObj is defined on *ObjPool; it waits up to timeout for a free object.
func (p *ObjPool) GetObj(timeout time.Duration) (*ReusableObj, error) {
	select {
	case ret := <-p.bufChan:
		return ret, nil
	case <-time.After(timeout): // give up once the timeout expires
		return nil, errors.New("time out")
	}
}
func (p *ObjPool) ReleaseObj(obj *ReusableObj) error {
	select {
	case p.bufChan <- obj:
		return nil
	default: // the pool is full, so the send would block; return an error instead
		return errors.New("overflow")
	}
}

func TestObjPool(t *testing.T) {
	pool := NewObjPool(10)
	// if err := pool.ReleaseObj(&ReusableObj{}); err != nil { // try to put back one object more than the pool can hold
	// 	t.Error(err)
	// }
	for i := 0; i < 11; i++ {
		if v, err := pool.GetObj(time.Second * 1); err != nil {
			t.Error(err)
		} else {
			fmt.Printf("%T\n", v)
			if err := pool.ReleaseObj(v); err != nil {
				t.Error(err)
			}
		}
	}
	fmt.Println("Done")
}
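
The advice above to benchmark a pool before adopting it could look roughly like the sketch below. It is not the pool from this post: `expensiveObj`, `newExpensiveObj`, and `chanPool` are hypothetical stand-ins, and the 64 KB buffer is only an assumed "expensive" allocation. It uses `testing.Benchmark` so it can run as an ordinary program.

```go
package main

import (
	"fmt"
	"testing"
)

// expensiveObj stands in for an object that is costly to create (hypothetical).
type expensiveObj struct {
	buf []byte
}

// sink keeps results reachable so the compiler cannot discard the work.
var sink *expensiveObj

func newExpensiveObj() *expensiveObj {
	return &expensiveObj{buf: make([]byte, 64*1024)}
}

// chanPool is a minimal buffered-channel pool, similar in spirit to ObjPool.
type chanPool chan *expensiveObj

func newChanPool(n int) chanPool {
	p := make(chanPool, n)
	for i := 0; i < n; i++ {
		p <- newExpensiveObj()
	}
	return p
}

// benchmarkCreate creates a fresh object on every iteration.
func benchmarkCreate(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = newExpensiveObj()
	}
}

// benchmarkPool takes an object from the pool and puts it back on every iteration.
func benchmarkPool(b *testing.B) {
	p := newChanPool(4)
	b.ReportAllocs()
	b.ResetTimer() // exclude the cost of filling the pool
	for i := 0; i < b.N; i++ {
		obj := <-p
		sink = obj
		p <- obj
	}
}

func main() {
	create := testing.Benchmark(benchmarkCreate)
	pool := testing.Benchmark(benchmarkPool)
	fmt.Println("create:", create.String(), create.MemString())
	fmt.Println("pool:  ", pool.String(), pool.MemString())
}
```

In a real project the same two functions would normally live in a `_test.go` file as `BenchmarkXxx` functions and be run with `go test -bench=. -benchmem`. Whether the pool wins depends on how expensive the object really is, so the numbers are worth checking for the actual workload.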

sync.Pool

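Each Processor (a P in the Go scheduler) holds a private object (used only by that Processor, so it is goroutine-safe) and a shared pool (reachable from other Processors, so it is not goroutine-safe by itself). sync.Pool looks for an object in this order:

1. Try to take the private object.
2. If there is no private object, take one from the current Processor's shared pool (this needs synchronization).
3. If the current Processor's shared pool is also empty, try the shared pools of the other Processors.
4. If every pool is empty, call the user-supplied New function and return the new object.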

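sync.Pool puts an object back in this order:

1. If there is no private object, store it as the private object.
2. If a private object already exists, put it into the current Processor's shared pool.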

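Lifetime of sync.Pool objects
• The GC clears the objects cached in sync.Pool.
• A cached object therefore stays available only until a GC removes it: roughly until the next GC (since Go 1.13 it can survive one more cycle in a victim cache).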

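sync.Pool is suited to reducing the creation and GC cost of complex objects through reuse. It is goroutine-safe, which brings some synchronization overhead. Because the lifetime of its objects is tied to the GC, it is not suited to pooling resources whose lifecycle you have to manage yourself, such as connections.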

func TestSyncPool(t *testing.T) {
	pool := &sync.Pool{
		New: func() interface{} { // called when every pool is empty, to create a new object
			fmt.Println("Create a new object.")
			return 100
		},
	}
	v := pool.Get().(int)
	fmt.Println(v)
	pool.Put(3)
	runtime.GC() // GC clears objects cached in sync.Pool (since Go 1.13 an object can survive one cycle, so a single GC may not evict it)
	v1, _ := pool.Get().(int)
	fmt.Println(v1)
}
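
A more typical use of sync.Pool than the integer test above is reusing temporary buffers. The following is a minimal sketch under that assumption; `bufPool` and `greet` are hypothetical names chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool caches *bytes.Buffer values so frequently called code can reuse them.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// greet builds a string using a pooled buffer (hypothetical example function).
func greet(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from its previous use
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result stays valid after Put
}

func main() {
	fmt.Println(greet("gopher"))
	fmt.Println(greet("pool"))
}
```

Resetting the buffer right after Get matters: sync.Pool hands back whatever was put in, in whatever state it was left.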

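Profiling tools

Writing profiles to a file

Suited to short-lived, batch-style programs and to fine-grained profiling of a specific piece of code.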

# Install graphviz
brew install graphviz
# Add $GOPATH/bin to $PATH (on a Mac, edit the path in .bash_profile)
# Install go-torch (since Go 1.11, flame graphs are built into pprof)
go get github.com/uber/go-torch
# Download flamegraph.pl (https://github.com/brendangregg/FlameGraph) and copy it into $GOPATH/bin

Viewing a CPU profile interactively with pprof

# go tool pprof <binary name> <generated profile>
go tool pprof prof cpu.prof
	top
	list ***
	svg # generate a graph image

Exposing profiles over HTTP

Suited to long-running applications.

`import _ "net/http/pprof"`

Visit: http://<host>:<port>/debug/pprof/

`go tool pprof "http://<host>:<port>/debug/pprof/profile?seconds=10" # the default is 30 seconds`
`go-torch -seconds 10 http://<host>:<port>/debug/pprof/profile`

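The tuning process

Set an optimization goal -> find the system's bottlenecks -> optimize the bottlenecks

Common metrics

• Wall Time: wall-clock time (the absolute time a function takes to run, including time blocked on external factors)
• CPU Time: time actually spent executing on the CPU
• Block Time: time spent blocked
• Memory allocation
• GC times/time spent: the number of GC cycles and the time spent in GC
The testing tools covered in Notes (1) can be used to find bottlenecks.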
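
Some of these metrics can be read directly from the runtime. The sketch below is one possible way to capture wall time, allocation count, GC count, and GC pause time around a function. `runStats`, `measure`, and `allocateALot` are hypothetical names; the fields read from `runtime.MemStats` are standard.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// runStats holds a few of the metrics listed above for one run of a function.
type runStats struct {
	Wall    time.Duration // wall-clock time
	Mallocs uint64        // heap objects allocated during the run
	NumGC   uint32        // GC cycles completed during the run
	GCPause time.Duration // total stop-the-world GC pause time during the run
}

// sink keeps allocations reachable so they really happen on the heap.
var sink [][]byte

// measure runs work and reports the difference in runtime statistics.
func measure(work func()) runStats {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	start := time.Now()
	work()
	wall := time.Since(start)
	runtime.ReadMemStats(&after)
	return runStats{
		Wall:    wall,
		Mallocs: after.Mallocs - before.Mallocs,
		NumGC:   after.NumGC - before.NumGC,
		GCPause: time.Duration(after.PauseTotalNs - before.PauseTotalNs),
	}
}

// allocateALot is a hypothetical workload that allocates and forces one GC.
func allocateALot() {
	sink = sink[:0]
	for i := 0; i < 1000; i++ {
		sink = append(sink, make([]byte, 1024))
	}
	runtime.GC() // force a collection so the GC counters change
}

func main() {
	s := measure(allocateALot)
	fmt.Printf("wall=%v mallocs=%d gc=%d pause=%v\n", s.Wall, s.Mallocs, s.NumGC, s.GCPause)
}
```

CPU time and block time are better taken from the CPU and block profiles that pprof produces than from counters like these.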

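Tips

Reduce lock usage: for write-heavy, read-light workloads use a partitioned concurrent map such as ConcurrentMap; for read-heavy, write-light workloads use sync.Map.
Pass complex objects by reference (pointer) where possible, to avoid memory allocation and copying.
Initialize slices with a suitable capacity.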
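
The effect of the last tip can be checked with `testing.AllocsPerRun`. This is a small sketch with hypothetical function names; the element count of 1024 is an arbitrary assumption.

```go
package main

import (
	"fmt"
	"testing"
)

const n = 1024

// sink makes the slices escape to the heap so the allocations are counted.
var sink []int

// growFromEmpty appends to a nil slice, so the backing array is reallocated as it grows.
func growFromEmpty() {
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	sink = s
}

// preallocated reserves the full capacity up front, so append never has to grow.
func preallocated() {
	s := make([]int, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	sink = s
}

func main() {
	grow := testing.AllocsPerRun(100, growFromEmpty)
	pre := testing.AllocsPerRun(100, preallocated)
	fmt.Printf("allocations per run: grow=%.0f preallocated=%.0f\n", grow, pre)
}
```

The growing version also copies the existing elements on every reallocation, so the saving is in both allocations and copying.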

Appendix
The profile types Go supports
