13. Multithreading - 4. Thread Synchronization - 《Blog - 博客笔记》

Thread Synchronization
- Operations That Do Not Need synchronized
- Summary
Synchronized Methods
- Summary

Thread Synchronization

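When multiple threads run at the same time, thread scheduling is decided by the operating system, not by the program. Any thread can therefore be suspended by the operating system at any instruction and resumed some time later.
This raises a problem that does not exist in the single-threaded model: if multiple threads read and write a shared variable at the same time, the data can become inconsistent.
For example: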

public class Main {
    public static void main(String[] args) throws Exception {
        Thread add = new AddThread();
        Thread dec = new DecThread();
        add.start();
        dec.start();
        add.join();
        dec.join();
        System.out.println(Counter.count);
    }
}
class Counter {
    public static int count = 0;
}
class AddThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            Counter.count += 1;
        }
    }
}
class DecThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            Counter.count -= 1;
        }
    }
}

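The code above is simple: two threads operate on the same int variable, one adding 1 to it 10000 times and the other subtracting 1 from it 10000 times. The final result should be 0, yet in practice the result is usually not 0 and differs from run to run.
This is because reading and then writing a variable only gives the correct result if the whole sequence is an atomic operation. An atomic operation is a single operation, or a series of operations, that cannot be interrupted.
For example, the statement: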

n = n + 1;

looks like a single statement, but it actually corresponds to 3 instructions (shown here in simplified form):

ILOAD
IADD
ISTORE

Suppose n is 100. If two threads execute n = n + 1 at the same time, the result may well be 101 rather than 102, because:

┌───────┐    ┌───────┐
│Thread1│    │Thread2│
└───┬───┘    └───┬───┘
    │            │
    │ILOAD (100) │
    │            │ILOAD (100)
    │            │IADD
    │            │ISTORE (101)
    │IADD        │
    │ISTORE (101)│
    ▼            ▼

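If thread 1 is interrupted by the operating system right after executing ILOAD and thread 2 is then scheduled, thread 2's ILOAD still reads 100. After both threads have executed ISTORE, the final value is 101 instead of the expected 102.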
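The interleaving in the diagram can be replayed deterministically in a single thread. The sketch below is only an illustration: the class name LostUpdateReplay and the two "register" variables are hypothetical, and each register simply stands in for one thread's private working copy of n.

```java
class LostUpdateReplay {
    static int n = 100; // the shared variable from the diagram

    // Replays the diagram step by step. No real threads are involved;
    // register1 and register2 model the private copies of Thread1 and Thread2.
    static int replay() {
        n = 100;
        int register1 = n;          // Thread1: ILOAD  (reads 100)
        int register2 = n;          // Thread2: ILOAD  (also reads 100)
        register2 = register2 + 1;  // Thread2: IADD
        n = register2;              // Thread2: ISTORE (writes 101)
        register1 = register1 + 1;  // Thread1: IADD
        n = register1;              // Thread1: ISTORE (writes 101 again)
        return n;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 101, not 102
    }
}
```

One increment is lost because thread 1 computes its result from a value that thread 2 has already replaced.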
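This shows that in a multithreaded model, to keep the logic correct when reading and writing a shared variable, a group of instructions must execute atomically. In other words, while one thread is executing them, the other threads must wait: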

┌───────┐     ┌───────┐
│Thread1│     │Thread2│
└───┬───┘     └───┬───┘
    │             │
    │-- lock --   │
    │ILOAD (100)  │
    │IADD         │
    │ISTORE (101) │
    │-- unlock -- │
    │             │-- lock --
    │             │ILOAD (101)
    │             │IADD
    │             │ISTORE (102)
    │             │-- unlock --
    ▼             ▼

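Locking and unlocking guarantee that while one thread is executing the 3 instructions, no other thread can enter that instruction range. Even if the executing thread is suspended by the operating system partway through, other threads still cannot enter the range because they cannot acquire the lock. Only after the executing thread releases the lock do other threads get a chance to acquire it and proceed. The code between the lock and unlock operations is called a critical section; at any moment at most one thread can execute inside it.
So the atomicity of a piece of code is ensured by locking and unlocking. A Java program uses the synchronized keyword to lock on an object: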

synchronized(lock) {
    n = n + 1;
}

synchronized guarantees that at most one thread at a time can execute a code block guarded by the same lock. Rewriting the example above with synchronized gives:

public class Main {
    public static void main(String[] args) throws Exception {
        Thread add = new AddThread();
        Thread dec = new DecThread();
        add.start();
        dec.start();
        add.join();
        dec.join();
        System.out.println(Counter.count);
    }
}
class Counter {
    public static final Object lock = new Object();
    public static int count = 0;
}
class AddThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) { // lock
                Counter.count += 1;
            }
        }
    }
}
class DecThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) { // lock
                Counter.count -= 1;
            }
        }
    }
}

Note the code:

synchronized(Counter.lock) { // acquire the lock
    ...
} // release the lock

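It uses the Counter.lock instance as the lock. Before either thread can enter its own synchronized(Counter.lock) { ... } block, it must first acquire the lock, and the lock is released automatically when the synchronized block ends. As a result, reads and writes of Counter.count can never happen at the same time, and the code above prints 0 no matter how many times it is run.
Using synchronized solves the correctness problem of multiple threads accessing a shared variable, but at the cost of performance: synchronized blocks guarded by the same lock cannot run concurrently, and acquiring and releasing the lock itself takes time. So synchronized reduces the program's execution efficiency.
To summarize how to use synchronized: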

Find the blocks of thread code that modify the shared variable;
Choose a shared instance to act as the lock;
Use synchronized(lockObject) { ... }.

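Besides locking, synchronized also acts as a memory barrier: on entering a synchronized block a thread is forced to read the latest values of shared variables from main memory, and on leaving the block any modifications are forced back to main memory. This guarantee applies between threads that synchronize on the same lock.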
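As a minimal sketch of the visibility effect described above, consider a stop flag that is only ever read and written inside synchronized blocks on the same lock. The class StopFlagDemo and its method names are hypothetical and not part of the original post.

```java
class StopFlagDemo {
    private final Object lock = new Object();
    private boolean stopped = false; // only accessed while holding lock

    void stop() {
        synchronized (lock) {
            stopped = true;
        }
    }

    boolean isStopped() {
        synchronized (lock) {
            return stopped;
        }
    }

    // Starts a worker that loops until it observes the flag, then reports
    // whether the worker finished within the given timeout.
    static boolean workerSeesStop(long timeoutMillis) {
        StopFlagDemo flag = new StopFlagDemo();
        Thread worker = new Thread(() -> {
            while (!flag.isStopped()) {
                // keep polling; every call re-reads the flag under the lock
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        flag.stop();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(workerSeesStop(5000));
    }
}
```

Because both the write in stop() and the read in isStopped() go through the same lock, the worker is guaranteed to see the update eventually and leave its loop.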

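When using synchronized there is no need to worry about exceptions: whether or not an exception is thrown, the lock is released correctly at the end of the synchronized block: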

public void add(int m) {
    synchronized (obj) {
        if (m < 0) {
            throw new RuntimeException();
        }
        this.value += m;
    } // the lock is released here, with or without an exception
}
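The release-on-exception behavior can be checked with Thread.holdsLock, which reports whether the current thread holds the monitor of a given object. The following is a small sketch under that assumption; the class ReleaseOnExceptionDemo and the method lockReleasedAfterFailure are hypothetical names added for illustration.

```java
class ReleaseOnExceptionDemo {
    private final Object obj = new Object();
    private int value = 0;

    void add(int m) {
        synchronized (obj) {
            if (m < 0) {
                throw new RuntimeException("negative argument");
            }
            this.value += m;
        } // the monitor is released here on both the normal and the exceptional path
    }

    // Calls add() with an argument that makes it throw, then checks that
    // the calling thread no longer holds the monitor of obj.
    boolean lockReleasedAfterFailure() {
        try {
            add(-1);
        } catch (RuntimeException e) {
            // expected: add(-1) throws inside the synchronized block
        }
        return !Thread.holdsLock(obj);
    }

    public static void main(String[] args) {
        System.out.println(new ReleaseOnExceptionDemo().lockReleasedAfterFailure()); // prints true
    }
}
```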

Now let's look at an example that uses synchronized incorrectly:

public class Main {
    public static void main(String[] args) throws Exception {
        Thread add = new AddThread();
        Thread dec = new DecThread();
        add.start();
        dec.start();
        add.join();
        dec.join();
        System.out.println(Counter.count);
    }
}
class Counter {
    public static final Object lock1 = new Object();
    public static final Object lock2 = new Object();
    public static int count = 0;
}
class AddThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock1) { // get lock1
                Counter.count += 1;
            }
        }
    }
}
class DecThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock2) { // get lock2
                Counter.count -= 1;
            }
        }
    }
}

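The result is usually not 0, because the two threads' synchronized blocks lock different objects! Both threads can therefore hold a lock at the same time: the JVM only guarantees that a given lock is held by at most one thread at any moment, while two different locks can be held by two threads simultaneously.
So when using synchronized, which lock is acquired matters a great deal. If the lock object is wrong, the code logic is wrong.
Here is another example: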

public class Main {
    public static void main(String[] args) throws Exception {
        Thread[] ts = new Thread[]{new AddStudentThread(), new DecStudentThread(), new AddTeacherThread(),
                new DecTeacherThread()};
        for (Thread t : ts) {
            t.start();
        }
        for (Thread t : ts) {
            t.join();
        }
        System.out.println(Counter.studentCount);
        System.out.println(Counter.teacherCount);
    }
}
class Counter {
    public static final Object lock = new Object();
    public static int studentCount = 0;
    public static int teacherCount = 0;
}
class AddStudentThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.studentCount += 1;
            }
        }
    }
}
class DecStudentThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.studentCount -= 1;
            }
        }
    }
}
class AddTeacherThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.teacherCount += 1;
            }
        }
    }
}
class DecTeacherThread extends Thread {
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.teacherCount -= 1;
            }
        }
    }
}

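The 4 threads above read and write two shared variables, but they all use the single object Counter.lock as the lock. As a result, Counter.studentCount += 1 and Counter.teacherCount += 1, which could otherwise run concurrently, no longer can, and execution efficiency drops considerably. In fact, the threads that need synchronizing fall into two groups: AddStudentThread and DecStudentThread, and AddTeacherThread and DecTeacherThread. There is no contention between the groups, so two different locks should be used:
AddStudentThread and DecStudentThread use the lockStudent lock: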

synchronized(Counter.lockStudent) {
    ...
}

AddTeacherThread and DecTeacherThread use the lockTeacher lock:

synchronized(Counter.lockTeacher) {
    ...
}

Only this way can execution efficiency be maximized.
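Putting the two fragments together, a complete version might look like the sketch below. It is one possible arrangement, not the original author's code: the class TwoLockCounters and the helper repeat are hypothetical, and lambdas are used instead of four Thread subclasses to keep the example short.

```java
class TwoLockCounters {
    static final Object lockStudent = new Object();
    static final Object lockTeacher = new Object();
    static int studentCount = 0; // guarded by lockStudent
    static int teacherCount = 0; // guarded by lockTeacher

    // Builds a thread that runs the given step 10000 times.
    static Thread repeat(Runnable step) {
        return new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                step.run();
            }
        });
    }

    // Runs the four workers and returns { studentCount, teacherCount }.
    static int[] runAll() {
        studentCount = 0;
        teacherCount = 0;
        Thread[] ts = {
            repeat(() -> { synchronized (lockStudent) { studentCount += 1; } }),
            repeat(() -> { synchronized (lockStudent) { studentCount -= 1; } }),
            repeat(() -> { synchronized (lockTeacher) { teacherCount += 1; } }),
            repeat(() -> { synchronized (lockTeacher) { teacherCount -= 1; } }),
        };
        for (Thread t : ts) {
            t.start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { studentCount, teacherCount };
    }

    public static void main(String[] args) {
        int[] result = runAll();
        System.out.println(result[0]); // 0
        System.out.println(result[1]); // 0
    }
}
```

The student threads and the teacher threads never wait for each other, while each pair still excludes its own counterpart.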
Creating locks with the synchronized keyword has some limitations:

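A thread that is trying to acquire the lock cannot be interrupted.
No timeout can be set when trying to acquire the lock, so a thread that fails to get the lock can only block and wait.
Each lock has only a single condition, which may not be enough. A ReentrantLock, by contrast, can create multiple Condition objects.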
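For comparison, here is a rough sketch of the timeout case using java.util.concurrent.locks.ReentrantLock, whose tryLock(long, TimeUnit) method gives up after the specified wait. The class TryLockDemo and the method secondThreadAcquires are hypothetical names used only for this illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TryLockDemo {
    // The calling thread holds the lock for the whole test; a second thread
    // tries to acquire it with a timeout. Returns whether the second thread
    // got the lock.
    static boolean secondThreadAcquires(long waitMillis) {
        ReentrantLock lock = new ReentrantLock();
        boolean[] acquired = new boolean[1];
        lock.lock();
        try {
            Thread t = new Thread(() -> {
                try {
                    // Unlike synchronized, this wait is bounded and interruptible.
                    if (lock.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
                        try {
                            acquired[0] = true;
                        } finally {
                            lock.unlock();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            t.join(); // still holding the lock, so the second thread must time out
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println(secondThreadAcquires(100)); // prints false
    }
}
```

With synchronized the second thread would simply block until the lock became free; here it can fall back to other work once the timeout expires.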

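Operations That Do Not Need synchronized
The Java Language Specification defines several operations as atomic:
Assignment of primitive types (except long and double), for example: int n = m;
Assignment of reference types, for example: List list = anotherList.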

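long and double are 64-bit values. The specification allows a JVM to treat a write to a non-volatile long or double as two separate 32-bit writes, so such an assignment is not guaranteed to be atomic. In practice, JVMs on x64 (64-bit) platforms implement long and double assignment as atomic operations.
A single atomic statement does not need synchronization. For example: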

public void set(int m) {
    synchronized(lock) {
        this.value = m;
    }
}

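there is no need to synchronize here. (Atomicity alone does not guarantee that other threads see the new value promptly; for that, the field would need to be volatile or accessed under a lock.)
The same applies to references. For example: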

public void set(String s) {
    this.value = s;
}

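The assignment above does not need synchronization either.
However, if several assignments must take effect together, they have to be synchronized. For example: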

class Pair {
    int first;
    int last;
    public void set(int first, int last) {
        synchronized(this) {
            this.first = first;
            this.last = last;
        }
    }
}

Sometimes a clever transformation can turn a non-atomic operation into an atomic one. For example, if the code above is changed to:

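class Pair {
    int[] pair; // consider declaring this volatile: the reference write is atomic either way, but only volatile guarantees that other threads see the new array
    public void set(int first, int last) {
        int[] ps = new int[] { first, last };
        this.pair = ps;
    }
}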

then synchronization is no longer needed, because this.pair = ps is an atomic reference assignment. As for the statement:

int[] ps = new int[] { first, last };

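ps here is a local variable defined inside the method. Each thread has its own local variables, which neither affect nor are visible to other threads, so no synchronization is needed.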
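The reading side of this trick deserves a short sketch as well: a reader should copy the reference once and then use only that copy, so both values come from the same array. The class SnapshotPair below is a hypothetical variant of Pair that declares the field volatile and adds a get() method; neither is in the original code.

```java
class SnapshotPair {
    // volatile is an assumption added here so readers are guaranteed to see
    // the most recently published array together with its contents
    private volatile int[] pair = new int[] { 0, 0 };

    void set(int first, int last) {
        int[] ps = new int[] { first, last }; // built privately by the writer
        this.pair = ps;                       // published with one atomic reference write
    }

    int[] get() {
        return this.pair; // one atomic reference read; callers must not modify the array
    }

    // One writer keeps storing pairs of equal values while one reader
    // checks that every snapshot it takes has first == last.
    static boolean snapshotsAlwaysConsistent(int rounds) {
        SnapshotPair p = new SnapshotPair();
        boolean[] ok = { true };
        Thread writer = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                p.set(i, i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                int[] snap = p.get(); // read the reference once, then use only snap
                if (snap[0] != snap[1]) {
                    ok[0] = false;
                }
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok[0];
    }

    public static void main(String[] args) {
        System.out.println(snapshotsAlwaysConsistent(100000)); // prints true
    }
}
```

Reading this.pair[0] and this.pair[1] in two separate field accesses would defeat the purpose, since the writer could swap in a new array between them.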

Summary

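When multiple threads read and write shared variables at the same time, logic errors can result, so access must be synchronized with synchronized;
The essence of synchronization is locking a specified object; the code that follows can run only after the lock has been acquired;
Note that the lock object must be the same instance;
A single atomic operation as defined by the specification does not need synchronization.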

Synchronized Methods

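Java programs rely on synchronized to synchronize threads, and when using synchronized, which object is locked matters a great deal.
Letting each thread choose the lock object tends to make the code logic messy and also hurts encapsulation. A better approach is to encapsulate the synchronized logic. For example, we can write a counter like this: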

public class Counter {
    private int count = 0;
    public void add(int n) {
        synchronized(this) {
            count += n;
        }
    }
    public void dec(int n) {
        synchronized(this) {
            count -= n;
        }
    }
    public int get() {
        return count;
    }
}

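This way, a thread calling add() or dec() does not need to care about synchronization, because the synchronized blocks are inside add() and dec(). Note also that the object being locked is this, the current instance, so when multiple Counter instances are created they do not affect each other and can be used concurrently: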

Counter c1 = new Counter();
Counter c2 = new Counter();
// threads operating on c1:
new Thread(() -> {
    c1.add(1);
}).start();
new Thread(() -> {
    c1.dec(1);
}).start();
// threads operating on c2:
new Thread(() -> {
    c2.add(1);
}).start();
new Thread(() -> {
    c2.dec(1);
}).start();

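Now the Counter class can be called correctly from multiple threads.
If a class is designed to allow correct access from multiple threads, we say the class is thread-safe. The Counter class above is thread-safe, and so is java.lang.StringBuffer in the Java standard library.
There are also immutable classes such as String, Integer, and LocalDate. All of their fields are final, so multiple threads accessing them at the same time can only read, never write; these immutable classes are thread-safe too.
Finally, classes like Math that only provide static methods and have no fields are also thread-safe.
Apart from these few cases, most classes, ArrayList for example, are not thread-safe, and we must not modify them from multiple threads without synchronization. However, if all threads only read and never write, an ArrayList can be shared safely between threads.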
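When several threads do need to write to an ArrayList, the same synchronized technique applies: every access goes through one shared lock. The following is a minimal sketch; GuardedListDemo and fill are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedListDemo {
    // Starts the given number of threads, each adding perThread elements to
    // one shared ArrayList under a common lock, and returns the final size.
    static int fill(int threads, int perThread) {
        List<Integer> list = new ArrayList<>();
        Object lock = new Object();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    synchronized (lock) { // all writers use the same lock object
                        list.add(i);
                    }
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 5000)); // prints 20000
    }
}
```

Without the synchronized block, concurrent add() calls could lose elements or throw an exception while the internal array is being resized.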

Unless stated otherwise, a class is not thread-safe by default.

When the object we lock is this, we can instead mark the method itself as synchronized. The following two forms are equivalent:

public void add(int n) {
    synchronized(this) { // lock this
        count += n;
    } // unlock
}

public synchronized void add(int n) { // lock this
    count += n;
} // unlock

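A method marked with synchronized is therefore a synchronized method: the whole method body runs while holding the lock on the this instance.
Now consider: if the synchronized modifier is added to a static method, which object does it lock?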

public synchronized static void test(int n) {
    ...
}

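A static method has no this instance, because static methods belong to the class rather than to an instance. But every class has a Class instance created automatically by the JVM, so adding synchronized to a static method locks the Class instance of that class. The synchronized static method above is equivalent to: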

public class Counter {
    public static void test(int n) {
        synchronized(Counter.class) {
            ...
        }
    }
}
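To see that the two forms really share one lock, the sketch below mixes them: one method uses the synchronized static modifier and the other locks the Class instance explicitly. StaticSyncDemo and its methods are hypothetical and not part of the Counter class above.

```java
class StaticSyncDemo {
    private static int count = 0;

    // The synchronized modifier on a static method locks StaticSyncDemo.class.
    static synchronized void inc() {
        count += 1;
    }

    // Explicit form: locks the same Class instance as inc().
    static void dec() {
        synchronized (StaticSyncDemo.class) {
            count -= 1;
        }
    }

    static synchronized int get() {
        return count;
    }

    // Runs one incrementing and one decrementing thread and returns the final count.
    static int run() {
        synchronized (StaticSyncDemo.class) {
            count = 0;
        }
        Thread add = new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                inc();
            }
        });
        Thread sub = new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                dec();
            }
        });
        add.start();
        sub.start();
        try {
            add.join();
            sub.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0
    }
}
```

If inc() and dec() used different locks, the final value would usually not be 0, just as in the lock1 and lock2 example earlier.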

Now let's look at Counter's get() method:

public class Counter {
    private int count;
    public int get() {
        return count;
    }
    ...
}

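It is not synchronized, because reading a single int variable is an atomic operation and does not need synchronization.
However, if we change the code slightly to return an object containing two ints: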

public class Counter {
    private int first;
    private int last;
    public synchronized Pair get() { // lock
        Pair p = new Pair();
        p.first = first;
        p.last = last;
        return p;
    }
    ...
}

then it must be synchronized.
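A runnable sketch of why the lock matters here: if a writer always updates both fields together under the same lock, a synchronized get() can never return a half-updated pair. The class PairCounter, its nested Pair, and the set(int) method are assumptions added for this illustration.

```java
class PairCounter {
    static class Pair {
        int first;
        int last;
    }

    private int first;
    private int last;

    // Both fields change together while holding the lock on this.
    public synchronized void set(int value) {
        this.first = value;
        this.last = value;
    }

    // Copies both fields under the same lock, so the copy is never half-updated.
    public synchronized Pair get() {
        Pair p = new Pair();
        p.first = first;
        p.last = last;
        return p;
    }

    // One writer and one reader run concurrently; returns true if every
    // pair the reader saw had first == last.
    static boolean alwaysConsistent(int rounds) {
        PairCounter c = new PairCounter();
        boolean[] ok = { true };
        Thread writer = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                c.set(i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                Pair p = c.get();
                if (p.first != p.last) {
                    ok[0] = false;
                }
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok[0];
    }

    public static void main(String[] args) {
        System.out.println(alwaysConsistent(100000)); // prints true
    }
}
```

Removing synchronized from get() would allow the reader to copy first before an update and last after it.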

Summary

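Marking a method with synchronized turns the whole method into a synchronized block; the lock object of a synchronized instance method is this;
With sound design and data encapsulation, a class can be made thread-safe;
Unless stated otherwise, a class is not thread-safe by default;
Whether multiple threads can safely access an instance of a non-thread-safe class has to be analyzed case by case.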
