Use BBR TCP congestion control in OpenVZ
Assumption
I assume you build everything on a Linux distribution in a KVM environment, because the loop device is needed to mount the image.
Any step that needs root privilege is prefixed with sudo.
Making an Alpine Linux image
First, choose a Linux distribution. Among the known distros, Alpine Linux is a good choice: it is small and simple, so it is what we use here.
The core idea is to build a UML (User-mode Linux) image.
Before getting started, we assume the current directory is ~/uml; you can put all your UML-related files there.
Now create an empty image file and mount it on a folder:
ROOTFS="alpine_rootfs.img"
# Block size 1M, block count 192, so the image is 192M
dd if=/dev/zero of=$ROOTFS bs=1M count=192
# File system label name
LBL_NAME="ALPINE_ROOT"
# Use the file system of your choice
mkfs.ext4 -L $LBL_NAME $ROOTFS
mkdir alpine
# Mount image file via loop device into alpine/ folder
sudo mount -o loop $ROOTFS alpine/
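Before going further, it can be worth confirming that the image file has the size you expect. This is a minimal sketch, assuming a stat that supports -c (GNU coreutils or BusyBox); image_size_mib is a made-up helper name, not part of the original steps.

```shell
# image_size_mib FILE: print the size of FILE in whole MiB.
# (Hypothetical helper for illustration.)
image_size_mib() {
    # stat -c %s prints the file size in bytes
    echo $(( $(stat -c %s "$1") / 1024 / 1024 ))
}

# Usage:
#   image_size_mib "$ROOTFS"
```

For the image created with bs=1M count=192 above, this should report 192.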
After that, the lsblk and mount commands show the result:
# lsblk output
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 192M 0 loop /home/x/uml/alpine
# mount output
/home/x/uml/alpine_uml.img on /home/x/uml/alpine type ext4 (rw,relatime,data=ordered)
Choose your Alpine Linux version (currently 3.6) and use the apk tool to build the base system:
# Or "latest-stable"
REL="v3.6"
# Or using this to fetch latest version
# curl -sL https://dl-cdn.alpinelinux.org/alpine/ | grep -Eo "v[0-9]+\.[0-9]+" | sort -V | tail -1
# Mirror url
MIRROR="https://dl-cdn.alpinelinux.org/alpine"
# Repository url
REPO="$MIRROR/$REL/main"
# Get architecture of current machine
ARCH=`uname -m`
# Mount point for $ROOTFS
MNTDIR="alpine"
# Get apk tools version
APKV=`curl -s $REPO/$ARCH/APKINDEX.tar.gz | tar -Oxz | grep -a '^P:apk-tools-static$' -A1 | tail -1 | cut -d: -f2`
# Download apk tools and put into sbin/
curl -s $REPO/$ARCH/apk-tools-static-${APKV}.apk | tar -xz sbin/apk.static
# Install alpine-base into $MNTDIR
sudo sbin/apk.static --repository $REPO --update-cache --allow-untrusted --root $MNTDIR --initdb add alpine-base
# (Try again if any error)
# A brief form
# sudo sbin/apk.static -X $REPO -U --allow-untrusted -p $MNTDIR --initdb add alpine-base
# Write repository url into apk mirrorlist
sudo sh -c "echo $REPO > $MNTDIR/etc/apk/repositories"
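The grep -A1 pipeline above relies on the version line directly following the package name line in the index. A slightly more defensive sketch with awk is shown here, assuming the index keeps its P: (package) and V: (version) record layout; apk_index_version is a hypothetical helper name.

```shell
# apk_index_version PACKAGE: read APKINDEX text on stdin and print
# the version of PACKAGE. (Hypothetical helper for illustration.)
apk_index_version() {
    awk -v pkg="$1" '
        # A "P:" line starts the fields of one package record
        /^P:/ { in_pkg = ($0 == ("P:" pkg)) }
        # The "V:" line inside the matching record holds its version
        /^V:/ && in_pkg { print substr($0, 3); exit }
    '
}

# Usage (assumes the archive member is named APKINDEX):
#   APKV=$(curl -s $REPO/$ARCH/APKINDEX.tar.gz | tar -Oxz APKINDEX | apk_index_version apk-tools-static)
```

Matching on the whole P: line avoids picking up packages that merely share a prefix, such as apk-tools versus apk-tools-static.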
Then add the root file system entry to $MNTDIR/etc/fstab:
sudo sh -c "echo LABEL=$LBL_NAME / auto defaults 1 1 >> $MNTDIR/etc/fstab"
The content of $MNTDIR/etc/fstab may look like this:
#
# /etc/fstab: static file system information
#
# <file system> <dir> <type> <options> <dump> <pass>
/dev/cdrom /media/cdrom iso9660 noauto,ro 0 0
/dev/usbdisk /media/usb vfat noauto,ro 0 0
LABEL=ALPINE_ROOT / auto defaults 1 1
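Because the echo above appends with >>, running it twice leaves a duplicate root entry. Here is a small sketch that only appends when the label is not present yet; ensure_fstab_root is a hypothetical helper, and on the real image it would have to run as root.

```shell
# ensure_fstab_root LABEL FSTAB: append the root entry for LABEL to
# FSTAB unless a line for that label already exists.
# (Hypothetical helper for illustration.)
ensure_fstab_root() {
    label="$1"
    fstab="$2"
    if ! grep -q "^LABEL=${label}[[:space:]]" "$fstab" 2>/dev/null; then
        printf 'LABEL=%s / auto defaults 1 1\n' "$label" >> "$fstab"
    fi
}

# Usage (as root):
#   ensure_fstab_root "$LBL_NAME" "$MNTDIR/etc/fstab"
```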
If you need to copy anything from the host into the UML image, do it at this stage, but make sure it is statically built; otherwise the libraries it depends on cannot be found.
# Copy into the mounted image ($MNTDIR), not the image file itself
sudo mkdir -p $MNTDIR/etc/shadowsocks-libev
# Replace SS_STATIC_PATH with your static shadowsocks path
sudo cp $SS_STATIC_PATH/ss-* $MNTDIR/usr/local/bin
# Replace SS_CFG_PATH with yours
sudo cp $SS_CFG_PATH/config.json $MNTDIR/etc/shadowsocks-libev
You may also want to change some system settings, which can be found in $MNTDIR/etc/sysctl.conf
For example, a commonly used set of network optimizations is:
# max open files
fs.file-max = 51200
# max read buffer
net.core.rmem_max = 67108864
# max write buffer
net.core.wmem_max = 67108864
# default read buffer
net.core.rmem_default = 65536
# default write buffer
net.core.wmem_default = 65536
# max processor input queue
net.core.netdev_max_backlog = 4096
# max backlog
net.core.somaxconn = 4096
# resist SYN flood attacks
net.ipv4.tcp_syncookies = 1
# reuse timewait sockets when safe
net.ipv4.tcp_tw_reuse = 1
# turn off fast timewait sockets recycling (this key was removed in Linux 4.12; drop the line if sysctl reports it as unknown)
net.ipv4.tcp_tw_recycle = 0
# short FIN timeout
net.ipv4.tcp_fin_timeout = 30
# short keepalive time
net.ipv4.tcp_keepalive_time = 1200
# outbound port range
net.ipv4.ip_local_port_range = 10000 65000
# max SYN backlog
net.ipv4.tcp_max_syn_backlog = 4096
# max timewait sockets held by system simultaneously
net.ipv4.tcp_max_tw_buckets = 5000
# turn on TCP Fast Open on both client and server side
net.ipv4.tcp_fastopen = 3
# TCP receive buffer
net.ipv4.tcp_rmem = 4096 87380 67108864
# TCP write buffer
net.ipv4.tcp_wmem = 4096 65536 67108864
# turn on path MTU discovery
net.ipv4.tcp_mtu_probing = 1
# BBR
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
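Once the guest is running with these settings, one way to confirm that BBR is actually available is to look at the kernel's list of congestion control algorithms. This is a minimal sketch; has_congestion_control is a hypothetical helper, and the optional second argument exists only so the check can be tried against a sample file.

```shell
# has_congestion_control NAME [FILE]: succeed if NAME is listed as an
# available TCP congestion control algorithm.
# (Hypothetical helper for illustration.)
has_congestion_control() {
    cc_file="${2:-/proc/sys/net/ipv4/tcp_available_congestion_control}"
    # The file holds one space-separated line, e.g. "reno cubic bbr"
    for cc_name in $(cat "$cc_file"); do
        [ "$cc_name" = "$1" ] && return 0
    done
    return 1
}

# Usage (inside the guest):
#   has_congestion_control bbr && echo "bbr is available"
```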
When everything is done, remember to unmount the image:
sudo umount $MNTDIR
Making a User-mode Linux boot image (vmlinux)
In this phase we build the User-mode Linux kernel image. Choose a kernel version that fits your needs; the kernel source is available at kernel.org.
I highly recommend choosing a stable or longterm version. Here I assume 4.12.3, which was a stable version at the time of writing.
First, install the build dependencies:
# Change to your package manager
sudo pacman -S ncurses bc screen
wget --no-check-certificate https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.12.3.tar.xz
# Extract to current directory(~/uml)
tar xvf linux-4.12.3.tar.xz
cd linux-4.12.3
# Generate a default .config file
make defconfig ARCH=um
# Configure through the ncurses menu (CLI)
make menuconfig ARCH=um
You can configure it to your own taste (the * indicates a checked option); the following is an example:
UML-specific options
    ==> [*] Force a static link
Device Drivers
    ==> [*] Network device support
        ==> <*> Universal TUN/TAP device driver support
[*] Networking support
    ==> Networking options
        ==> [*] IP: TCP syncookie support
        ==> [*] TCP: advanced congestion control
            ==> <*> BBR TCP
            ==> <*> Default TCP congestion control (BBR)
        ==> [*] QoS and/or fair queueing
            ==> <*> Quick Fair Queueing scheduler (QFQ)
            ==> <*> Controlled Delay AQM (CODEL)
            ==> <*> Fair Queue Controlled Delay AQM (FQ_CODEL)
            ==> <*> Fair Queue
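After leaving menuconfig, the choices end up in the .config file, so they can be double-checked before starting a long build. The sketch below assumes the menu entries map to the symbols STATIC_LINK, TUN, TCP_CONG_BBR and NET_SCH_FQ (my reading of the Kconfig names; verify them in your tree). missing_kconfig is a hypothetical helper.

```shell
# missing_kconfig CONFIG_FILE SYMBOL...: print every SYMBOL that is
# not built in (=y) according to CONFIG_FILE.
# (Hypothetical helper for illustration.)
missing_kconfig() {
    config="$1"
    shift
    for sym in "$@"; do
        grep -q "^CONFIG_${sym}=y\$" "$config" || echo "$sym"
    done
}

# Usage (in the kernel source directory); no output means all are set:
#   missing_kconfig .config STATIC_LINK TUN TCP_CONG_BBR NET_SCH_FQ
```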
After that, you can build the boot image and strip all symbols:
# Compile UML with nproc parallel jobs (this may take some time)
make -j`nproc` ARCH=um vmlinux
VMLINUX="vmlinux-4.12.3-`uname -m`"
mv vmlinux $VMLINUX
strip -s $VMLINUX
# Move $VMLINUX image into ~/uml
mv $VMLINUX ..
Final setup on the host machine
Before you launch the UML, you should set up a tunnel (a TAP device) that provides a network connection between host and guest, so the guest can reach the internet.
The following script does this for you (run it on the host machine):
tap.sh
(Needs root privilege)
#!/bin/bash
# [NOTE] Make sure the TUN/TAP is available
if [ $EUID -ne 0 ]; then
echo "[ERROR] Must run as root"
exit 1
fi
# If you want to delete it
# ip tuntap del tap0 mode tap
# Ethernet (preferred) or wireless will be chosen
NET=`ls /sys/class/net | grep '[we]' | head -1`
# Port forwarding range
RNG="8000:10000"
# Host tunnel address
SRC="10.0.0.1"
# UML tunnel address
DST="10.0.0.2"
ip tuntap add tap0 mode tap
ip addr add $SRC/24 dev tap0
ip link set tap0 up
iptables -P FORWARD ACCEPT
iptables -t nat -A POSTROUTING -o $NET -j MASQUERADE
iptables -t nat -A PREROUTING -i $NET -p tcp --dport $RNG -j DNAT --to-destination $DST
iptables -t nat -A PREROUTING -i $NET -p udp --dport $RNG -j DNAT --to-destination $DST
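Picking the uplink interface by name works on many machines, but it is a guess. An alternative sketch is to take the device of the default route from the output of ip route show default; default_route_dev is a hypothetical helper that only parses that text.

```shell
# default_route_dev: read `ip route show default` output on stdin and
# print the device of the first default route.
# (Hypothetical helper for illustration.)
default_route_dev() {
    awk '$1 == "default" {
        # The device name is the word following "dev"
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); exit }
    }'
}

# Usage (in tap.sh, instead of the ls | grep line):
#   NET=$(ip route show default | default_route_dev)
```

This also covers interface names that do not look like ethernet or wireless devices.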
Note that $RNG defines which host ports are forwarded to the guest. You should not forward all ports to the guest; that would be inefficient and would put extra load on the guest.
You may also need to pack everything up and transfer it to a remote host with scp; you can do it like this:
XZ_NAME="alpine_uml.tar.xz"
# Use xz to pack things up
# Make sure the remote machine has xz or xz-utils installed
tar cvfJ $XZ_NAME $ROOTFS $VMLINUX tap.sh uml.sh
scp -P<port> $XZ_NAME root@<remote_ip>:<some_dir>
# extract it in your remote machine(xz is needed)
# tar xvf $XZ_NAME
Start UML in user space
You can run the following script to start your UML:
uml.sh
(Needs root privilege)
#!/bin/bash
if [[ $EUID -ne 0 ]]; then
echo "[ERROR] Must run as root"
exit 1
fi
VMLINUX="vmlinux-4.12.3-`uname -m`"
ROOTFS="alpine_rootfs.img"
# Yes, Alpine Linux can run with 64 MB of memory
./$VMLINUX ubda=$ROOTFS rw eth0=tuntap,tap0 mem=64m
# If it prompts you:
# ...
# Checking environment variables for a tempdir...none found
# Checking if /dev/shm is on tmpfs...OK
# Checking PROT_EXEC mmap in /dev/shm...Operation not permitted
# /dev/shm must be not mounted noexec
# You should:
# export TMPDIR=/tmp
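The /dev/shm message above appears when that mount carries the noexec option. A small sketch to check this up front by reading the mount table follows; is_noexec is a hypothetical helper, and its optional second argument only exists so it can be tried against a sample file instead of /proc/mounts.

```shell
# is_noexec MOUNTPOINT [MOUNTS_FILE]: succeed if MOUNTPOINT is mounted
# with the noexec option. (Hypothetical helper for illustration.)
is_noexec() {
    awk -v mp="$1" '
        # Field 2 is the mount point, field 4 the comma-separated options
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noexec") found = 1
        }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Usage (in uml.sh, before starting the kernel):
#   is_noexec /dev/shm && export TMPDIR=/tmp
```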
After the UML has booted, you should see output like:
Virtual console 1 assigned device '/dev/pts/2'
...
Virtual console 6 assigned device '/dev/pts/8'
Use the screen tool to connect to one of the pseudo-ttys:
# If screen cannot find a terminfo entry for 'xxx':
# export TERM="xterm-256color"
# X is any available pts number from the output above
sudo screen /dev/pts/X
# NOTE:
# Once screen starts, the terminal is blank
# Press [Enter] and the login prompt shows up
# The root user has no password
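If the boot output scrolls by too quickly, the pts devices can be pulled out of a saved copy of it. This is a minimal sketch, assuming the "Virtual console N assigned device" lines look like the ones shown above; uml_consoles and boot.log are hypothetical names.

```shell
# uml_consoles: read UML boot output on stdin and print each
# /dev/pts/N device it mentions, one per line.
# (Hypothetical helper for illustration.)
uml_consoles() {
    grep -o '/dev/pts/[0-9][0-9]*'
}

# Usage, with the boot output saved to a file first:
#   uml_consoles < boot.log
```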
Useful screen shortcuts and commands you may need:
- Terminate the current screen: Control-A + \
- Detach the current screen: Control-A + D
- Restore a detached screen: screen -r
If you want to shut down the UML, type halt or poweroff in Alpine Linux.
Once logged in to the guest machine, run setup-alpine to auto-configure it.
Remember to set:
- IP address for eth0: 10.0.0.2
- Netmask: 255.255.255.0
- Gateway: 10.0.0.1
- DNS nameserver(s): 8.8.8.8 8.8.4.4
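For reference, the static settings listed above correspond roughly to an /etc/network/interfaces file in the usual ifupdown format. The sketch below only prints such a file so you can compare it with what setup-alpine wrote; render_interfaces is a hypothetical helper, and the exact file on your system may differ.

```shell
# render_interfaces ADDRESS NETMASK GATEWAY: print an interfaces file
# with a loopback entry and a static eth0 entry.
# (Hypothetical helper for illustration.)
render_interfaces() {
    cat <<EOF
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address $1
    netmask $2
    gateway $3
EOF
}

# Usage:
#   render_interfaces 10.0.0.2 255.255.255.0 10.0.0.1
```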
Or configure it manually. The first thing is to set up the network; generally there are three things to do:
- Bring the ethernet device up
- Add the $DST IP address to that ethernet device
- Make $SRC the default gateway of that ethernet device
Then write some DNS servers into /etc/resolv.conf:
nameserver 8.8.8.8
nameserver 8.8.4.4
If you have trouble with networking, you may need to restart it:
/etc/init.d/networking restart
A nice extra is to create a swapfile:
# Create a 64M swapfile (make sure your $ROOTFS has enough space)
dd if=/dev/zero of=/swapfile bs=1M count=64
chmod 600 /swapfile
Alternatively, the following script does all of this for you at boot:
/etc/local.d/local.start
#!/bin/sh
# (Alpine's base system ships BusyBox sh; bash is not installed by default)
# swap on
/sbin/mkswap /swapfile
/sbin/swapon /swapfile
# fix net(Choose one)
# Auto-configured
/etc/init.d/networking restart
# Manually-configured
/sbin/ip link set eth0 up
/sbin/ip addr add 10.0.0.2/24 dev eth0
/sbin/ip route add default via 10.0.0.1 dev eth0
# shadowsocks-libev
/usr/bin/nohup /usr/local/bin/ss-server -c /etc/shadowsocks-libev/config.json &
Make sure it has execute permission:
# Add local service to runlevel(Done once)
rc-update add local default
# Files inside /etc/local.d belong to the local service
chmod a+x /etc/local.d/local.start
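After the next boot, a quick way to see whether local.start really ran is to check that the swapfile is active. This is a minimal sketch that reads the kernel's swap table; swap_active is a hypothetical helper, and the optional second argument only exists so it can be tried against a sample file instead of /proc/swaps.

```shell
# swap_active PATH [SWAPS_FILE]: succeed if PATH is listed as an
# active swap area. (Hypothetical helper for illustration.)
swap_active() {
    # Line 1 of the table is a header; field 1 is the swap path
    awk -v path="$1" '
        NR > 1 && $1 == path { found = 1 }
        END { exit !found }
    ' "${2:-/proc/swaps}"
}

# Usage (inside the guest):
#   swap_active /swapfile && echo "swap is on"
```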
If you want to disable some unnecessary (NOT all) pseudo-ttys, you may want to edit /etc/inittab and comment out the tty*::... lines.
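The same edit can be scripted with sed. This is a sketch that assumes the entries start with tty1:: through tty6:: at the beginning of a line and that your sed supports -i; comment_out_ttys is a hypothetical helper and only handles single-digit tty numbers.

```shell
# comment_out_ttys FILE FIRST LAST: prefix the inittab entries for
# ttyFIRST..ttyLAST (single digits) with "#".
# (Hypothetical helper for illustration.)
comment_out_ttys() {
    sed -i "/^tty[$2-$3]::/s/^/#/" "$1"
}

# Usage (keeps tty1 and tty2, disables tty3 to tty6):
#   comment_out_ttys /etc/inittab 3 6
```

Keeping at least one tty enabled matters, since that is how you log in through screen.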
Finally, apk is the default package manager of Alpine Linux; installing a tool is easy:
apk add vim
Fortunately, $ROOTFS and $VMLINUX are reusable on other Linux systems, so you don't need to rebuild the image and kernel again.
Alpine UML built under Debian 8 x64(3.16.0-4-amd64)
The rootfs.img is 192M, with a 64M swapfile inside
The tap.sh is used to create the tunnel adapter
The uml.sh is used to launch the Alpine UML
Inside the Alpine UML two major tools are available:
1) vi
2) shadowsocks-libev
The shadowsocks-libev binary files are located at:
/usr/local/bin
and its config file is located at:
/etc/shadowsocks-libev/config.json
The startup script is located at:
/etc/local.d/local.start
NOTE: This UML should not be used in any form of production scenario