Rust study notes

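Memory safety

Segmentation faults

Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.

Memory-unsafe behaviors:

  • Null pointer dereference
  • Wild pointers
  • Dangling pointers
  • Use of uninitialized pointers
  • Invalid free
  • Buffer overflow
  • Calling an invalid function pointer
  • Data races

Behaviors that are not considered memory-unsafe:

  • Memory leaks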


Ownership

  1. Every value in Rust has exactly one owner.
  2. When the owner (a variable) goes out of scope, the value is dropped.

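Borrowing and ownership

  1. A variable can have any number of read-only borrows (shared references) at the same time, although there are still other ways to modify the value through them (interior mutability, e.g. Cell and RefCell).
  2. A variable can have only one writable borrow at a time; a writable borrow can modify the variable's contents but cannot transfer ownership.

    Copy Trait

    Types whose size is known at compile time and that live entirely on the stack (value types) generally implement the Copy trait; copying them is a shallow, bitwise copy.

  • Tuples, if and only if every type they contain is also Copy. For example, (i32, i32) is Copy, but (i32, String) is not.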

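The difference between Copy and Clone

Copy is a shallow copy that the compiler performs implicitly; Clone is called explicitly by the programmer and is typically a deep copy.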
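
A minimal sketch of the difference, assuming a made-up Point type and two helper functions (copy_point and clone_string are illustrative names, not from any library):

```rust
// A Copy type: assignment duplicates the bits, and the source stays usable.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Returns (original, copy); both remain valid after the implicit copy.
fn copy_point() -> (Point, Point) {
    let a = Point { x: 1, y: 2 };
    let b = a; // implicit bitwise copy, `a` is still usable
    (a, b)
}

/// Returns (original, clone); the clone owns a separate heap buffer.
fn clone_string() -> (String, String) {
    let s = String::from("hello");
    let mut t = s.clone(); // explicit call, copies the heap data
    t.push_str(" world"); // does not affect `s`
    (s, t)
}

fn main() {
    let (a, b) = copy_point();
    println!("{:?} {:?}", a, b);
    let (s, t) = clone_string();
    println!("{} / {}", s, t);
}
```

Without the clone() call, `let t = s;` would move the String and `s` could no longer be used.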

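References and dereferencing

A & reference can be thought of as a temporary borrow of ownership (it is not quite a C pointer; it just feels like one in use).

  • Dereferencing

There are two kinds of references: immutable references and mutable references.

  • At any given time, you can have either exactly one mutable reference or any number of immutable references.
  • References must always be valid.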
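
The two rules above can be sketched as follows; shared_then_mutable is a hypothetical helper written only for this note:

```rust
/// Takes several shared borrows first, then a single mutable borrow.
fn shared_then_mutable() -> Vec<i32> {
    let mut v = vec![1, 2, 3];
    {
        // Any number of immutable references may coexist.
        let a = &v;
        let b = &v;
        assert_eq!(a.len(), b.len());
    } // `a` and `b` are no longer used past this point

    // Exactly one mutable reference, with no shared ones alive.
    let m = &mut v;
    m.push(4);

    // Keeping a shared reference alive while using `m` would be rejected
    // by the borrow checker.
    v
}

fn main() {
    println!("{:?}", shared_then_mutable());
}
```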

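Slice type

  • It does not own the data.
  • It is only a reference to a contiguous run of data owned elsewhere (often on the heap, but also on the stack or in static memory).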
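
A small sketch of slices borrowing from data they do not own; first_word is an illustrative helper, not a standard function:

```rust
/// Returns the part of `s` before the first space, borrowed from `s`.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

fn main() {
    let owned = String::from("hello world"); // buffer on the heap
    let word: &str = first_word(&owned); // string slice into `owned`
    println!("{}", word);

    let arr = [1, 2, 3, 4, 5]; // array on the stack
    let middle: &[i32] = &arr[1..4]; // slice into the array
    println!("{:?}", middle);
}
```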


Structs

struct

struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}
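  1. #[derive(Debug)] makes it easy to print debug output with {:?}
  2. Functions defined on a struct whose first parameter is &self or &mut self are called methods
  3. &self (&mut self) is similar to the this pointer
  4. Associated functions are defined on the struct, but their first parameter is not self (similar to static member functions of a C++ class)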
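
A sketch of these four points on a trimmed-down User (only two fields are kept here, and the method names are made up for illustration):

```rust
#[derive(Debug)]
struct User {
    username: String,
    sign_in_count: u64,
}

impl User {
    // Associated function: no self parameter, called as User::new(..).
    fn new(username: &str) -> User {
        User {
            username: username.to_string(),
            sign_in_count: 0,
        }
    }

    // Method taking &self: read-only access.
    fn name(&self) -> &str {
        &self.username
    }

    // Method taking &mut self: may modify the struct.
    fn sign_in(&mut self) {
        self.sign_in_count += 1;
    }
}

fn main() {
    let mut user = User::new("alice");
    user.sign_in();
    println!("{} -> {:?}", user.name(), user);
}
```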


Tuple structs
A tuple struct is a struct without field names.

struct Color(i32, i32, i32);

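Enums

Enums exist in many languages, but their capabilities differ from language to language. Rust's enums are most similar to the algebraic data types of functional languages such as F#, OCaml, and Haskell.

They are quite different from enums in C or Go: a variant is not just a named integer value, and each variant can carry its own data.

A typical example: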

enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

Option

enum Option<T> {
    Some(T),
    None,
}

match must be exhaustive

if let is a shorthand for a match that handles only one pattern and ignores the rest
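
A sketch of both forms on the Message enum from above; describe and written_text are hypothetical helpers:

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

/// Exhaustive match: removing any arm makes this fail to compile.
fn describe(msg: &Message) -> String {
    match msg {
        Message::Quit => "quit".to_string(),
        Message::Move { x, y } => format!("move to ({}, {})", x, y),
        Message::Write(text) => format!("write {}", text),
        Message::ChangeColor(r, g, b) => format!("color ({}, {}, {})", r, g, b),
    }
}

/// if let: only one variant is of interest, everything else is ignored.
fn written_text(msg: &Message) -> Option<&str> {
    if let Message::Write(text) = msg {
        Some(text.as_str())
    } else {
        None
    }
}

fn main() {
    let msgs = [
        Message::Quit,
        Message::Move { x: 1, y: 2 },
        Message::Write(String::from("hi")),
        Message::ChangeColor(0, 128, 255),
    ];
    for m in &msgs {
        println!("{} / {:?}", describe(m), written_text(m));
    }
}
```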

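crate, mod

  • Modules: a way to organize code and control the privacy of paths
  • Paths: a way of naming an item
  • The use keyword brings a path into scope; the super keyword refers to the parent module
  • The pub keyword makes an item public
  • The as keyword renames an item as it is brought into scope
  • Using external packages
  • Nested paths cut down on large numbers of use statements
  • The glob operator brings everything in a module into scope
  • How to split modules into separate files
  • Re-exporting with pub use
  • External packages are used by listing them under [dependencies] in Cargo.toml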
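
A single-file sketch of several of these keywords; the kitchen and menu modules and their contents are invented for this note:

```rust
mod kitchen {
    // Private to `kitchen`, but child modules can reach it through `super`.
    const TAX_PERCENT: u32 = 10;

    pub mod menu {
        // `pub` makes the function visible outside the module.
        pub fn price_with_tax(base: u32) -> u32 {
            base + base * super::TAX_PERCENT / 100
        }
    }

    // Re-export under a different name with `pub use ... as`.
    pub use self::menu::price_with_tax as total;
}

// Bring the full path into scope, and the re-export under another name.
use kitchen::menu::price_with_tax;
use kitchen::total as total_price;

fn main() {
    println!("{} {}", price_with_tax(200), total_price(50));
}
```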

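Advanced Rust

unsafe

  • Dereferencing a raw pointer
  • Calling an unsafe function or method
  • Accessing or modifying a mutable static variable
  • Implementing an unsafe trait
    If the unsafe keyword is used where it is not needed, that is, none of the four cases above occurs inside the block, the compiler reports that as well (an unused_unsafe warning).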
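
A sketch of the first two cases; both function names are made up, and the safety conditions are the caller's responsibility as noted in the comments:

```rust
/// Case 1: dereferencing a raw pointer needs an unsafe block.
fn read_through_raw_pointer() -> i32 {
    let mut x = 10;
    let p = &mut x as *mut i32; // creating the raw pointer is safe
    unsafe {
        *p += 5; // dereferencing it is not
        *p
    }
}

/// Case 2: an unsafe function. The caller must guarantee `i < xs.len()`.
unsafe fn double_unchecked(xs: &[i32], i: usize) -> i32 {
    // Safety: upheld by the caller as documented above.
    unsafe { *xs.get_unchecked(i) * 2 }
}

fn main() {
    println!("{}", read_through_raw_pointer());
    let xs = [1, 2, 3];
    // Calling an unsafe function also needs an unsafe block.
    let doubled = unsafe { double_unchecked(&xs, 1) };
    println!("{}", doubled);
}
```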

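Cell usage

use std::cell::Cell;

fn main() {
    let data: Cell<i32> = Cell::new(100);
    let p = &data;
    data.set(10); // modify through a binding that is not declared mut
    println!("{}", p.get());

    p.set(20); // modify through a shared reference
    println!("{:?}", data);
}

Interior mutability means the contents of a Cell can be modified without mut.
Cell's public interface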

impl<T> Cell<T> {
    // This one requires a writable borrow; all the others take a read-only borrow yet can still modify the contents
    pub fn get_mut(&mut self) -> &mut T {}
    pub fn set(&self, val: T) { }
    pub fn swap(&self, other: &Self) { }
    pub fn replace(&self, val: T) -> T { }
    pub fn into_inner(self) -> T { }
}
// get is only available when T: Copy, because it returns a copy of the value
impl<T: Copy> Cell<T> {
    pub fn get(&self) -> T { }
}

RefCell interface

impl<T: ?Sized> RefCell<T> {
    pub fn borrow(&self) -> Ref<T> { }
    pub fn try_borrow(&self) -> Result<Ref<T>, BorrowError> { }
    pub fn borrow_mut(&self) -> RefMut<T> { }
    pub fn try_borrow_mut(&self) -> Result<RefMut<T>, BorrowMutError> { }
    pub fn get_mut(&mut self) -> &mut T { }
}

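Unlike Cell, it cannot read or modify its contents directly; instead you call borrow or borrow_mut to obtain the corresponding Ref or RefMut. The borrowing rules are then checked at run time, and a conflicting borrow panics.

How to choose between Cell and RefCell?
If you only need to store and take out a T as a whole, choose Cell. If you need a readable or writable pointer to the T in order to modify it in place, choose RefCell.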
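
A sketch of that choice, assuming a hypothetical Counter type: the u32 is replaced as a whole (Cell), while the Vec is modified in place through a RefMut (RefCell).

```rust
use std::cell::{Cell, RefCell};

struct Counter {
    hits: Cell<u32>,           // small Copy value, swapped as a whole
    log: RefCell<Vec<String>>, // modified in place through a borrow
}

impl Counter {
    // Takes &self, yet both fields can change.
    fn record(&self, what: &str) {
        self.hits.set(self.hits.get() + 1);
        self.log.borrow_mut().push(what.to_string());
    }
}

/// Returns (hits, log length, whether a conflicting borrow was refused).
fn demo() -> (u32, usize, bool) {
    let c = Counter {
        hits: Cell::new(0),
        log: RefCell::new(Vec::new()),
    };
    c.record("a");
    c.record("b");

    let reader = c.log.borrow();
    // While `reader` is alive, a mutable borrow is refused at run time.
    let conflict = c.log.try_borrow_mut().is_err();
    (c.hits.get(), reader.len(), conflict)
}

fn main() {
    println!("{:?}", demo());
}
```

Calling borrow_mut() instead of try_borrow_mut() at the conflicting point would panic.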

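Borrowing rules

  1. A borrow must not outlive the variable it points to.
  2. A &mut borrow can only point to a variable that is itself declared mut.
  3. While a &mut borrow exists, the borrowed variable is frozen and cannot be used directly.
  4. & borrows and &mut borrows are mutually exclusive; they cannot exist at the same time.
  5. At most one &mut borrow can exist at a time.
  6. As long as there is no &mut borrow, any number of & borrows can coexist.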


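Dereferencing

Rust dereferences automatically, which looks much like an implicit type conversion.
For example, Vec implements Deref,
so a &Vec<T> can be used wherever a &[T] is expected.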
use std::rc::Rc;

fn main() {
    let s = Rc::new(String::from("hello"));
    println!("{:?}", s.bytes());
}

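In this example s has type Rc<String>, which has no bytes method. Rust dereferences automatically until it reaches str, then calls bytes on that.
The process goes like this:

  1. Try Rc::bytes(&s); not possible, go on to 2
  2. Try String::bytes(Rc::deref(&s)); not possible, go on to 3
  3. Try str::bytes(String::deref(Rc::deref(&s))); ok, done
    What actually runs is s.deref().deref().bytes()

The compiler keeps trying deref automatically until no further deref is possible. If it cannot work out the call on its own, you have to step in manually.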
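
A sketch comparing the automatic chain with the same calls written out by hand (byte_counts and len_of are illustrative helpers; the explicit form needs the Deref trait in scope):

```rust
use std::ops::Deref;
use std::rc::Rc;

/// Accepts &str; a &Rc<String> is coerced to it through two derefs.
fn len_of(s: &str) -> usize {
    s.len()
}

/// Returns the byte count obtained automatically and by explicit derefs.
fn byte_counts() -> (usize, usize) {
    let s = Rc::new(String::from("hello"));
    let auto = s.bytes().count(); // compiler inserts the derefs
    let manual = s.deref().deref().bytes().count(); // Rc -> String -> str
    (auto, manual)
}

fn main() {
    let s = Rc::new(String::from("hello"));
    println!("{:?} {}", byte_counts(), len_of(&s));
}
```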

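&*s is not the same as applying * and then &

let s = Box::new(String::new());

&*s is treated directly as s.deref(); it does not first move the inner data out with *s and then borrow the result.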

Macro expansion

cargo rustc -- -Z unstable-options --pretty=expanded

(This needs a nightly toolchain; newer nightlies spell the flag -Zunpretty=expanded.)

Lifetime of temporaries

From a discussion of why borrowing a temporary is valid:

Temporary lifetimes

When using a value expression in most place expression contexts, a temporary unnamed memory location is created initialized to that value and the expression evaluates to that location instead

This applies, because String::new() is a value expression and being just below &mut it is in a place expression context. Now the reference operator only has to pass through this temporary memory location, so it becomes the value of the whole right side (including the &mut).

When a temporary value expression is being created that is assigned into a let declaration, however, the temporary is created with the lifetime of the enclosing block instead

Since it is assigned to the variable it gets a lifetime until the end of the enclosing block.

This also answers this question about the difference between

let a = &String::from("abcdefg"); // ok!

and

let a = String::from("abcdefg").as_str(); // compile error: the borrow returned by as_str() has to live as long as a, but the temporary returned by String::from() only lives until the end of this statement

In the second variant the temporary is passed into as_str(), so its lifetime ends at the end of the statement.

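Put simply, a temporary lives only until the end of its statement, but if a reference to it is bound directly with let, its lifetime is extended to that of the nearest enclosing block.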
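
A sketch of the variant that compiles and of the usual fix for the one that does not; the function names are made up:

```rust
/// The temporary String is bound through `&` in a let, so it lives
/// until the end of the enclosing block.
fn extended_temporary() -> usize {
    let a = &String::from("abcdefg");
    a.len()
}

/// Fix for the failing variant: give the String its own binding first,
/// so the value outlives the borrow returned by as_str().
fn bound_first() -> usize {
    let owned = String::from("abcdefg");
    let a = owned.as_str();
    a.len()
}

fn main() {
    println!("{} {}", extended_temporary(), bound_first());
}
```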

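'static in generics

A bound such as where T: 'static means that type T contains no borrows that point to anything short-lived: either it contains no borrows at all, or the borrows it contains are 'static.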
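
A small sketch of what satisfies the bound; describe is a hypothetical function:

```rust
use std::fmt::Display;

/// Accepts any displayable value that holds no short-lived borrows.
fn describe<T: Display + 'static>(value: T) -> String {
    format!("value: {}", value)
}

fn main() {
    // Owned data contains no borrows at all.
    println!("{}", describe(String::from("owned")));
    println!("{}", describe(42));
    // A string literal is a &'static str, which is also allowed.
    println!("{}", describe("literal"));
    // Passing `&local` for a local String would be rejected, because
    // that borrow is shorter than 'static.
}
```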

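Lifetime elision rules for functions

  • Each input parameter that takes a lifetime parameter gets its own, distinct lifetime parameter;
  • If only one input parameter has a lifetime parameter, the return value's lifetime is that parameter's;
  • If several input parameters have lifetime parameters but one of them is &self or &mut self, the return value's lifetime is that of self;
  • If none of the above applies, the return value's lifetime cannot be filled in automatically and must be written out.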
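
The rules above can be sketched with three small functions (all names are invented for this note):

```rust
// One input lifetime: equivalent to
// fn trimmed<'a>(s: &'a str) -> &'a str
fn trimmed(s: &str) -> &str {
    s.trim()
}

struct Text {
    body: String,
}

impl Text {
    // Two input lifetimes, but one belongs to &self, so the result
    // borrows from self and not from `prefix`.
    fn after_prefix(&self, prefix: &str) -> &str {
        self.body
            .strip_prefix(prefix)
            .unwrap_or(self.body.as_str())
    }
}

// Two input lifetimes and no self: elision fails, so the lifetime
// has to be written out.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let t = Text { body: String::from("prefix-rest") };
    println!("{} {} {}", trimmed("  x  "), t.after_prefix("prefix-"), longer("ab", "abc"));
}
```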

Higher-ranked lifetimes

So far, syntax such as for<'a> Fn(&'a Arg) -> &'a Ret can only be used with lifetime parameters, not with arbitrary generic types.

fn calc_by<'a, F>(var: &'a i32, f: F) -> i32 where F: for<'f> Fn(&'f i32) -> i32 {
    let local = *var;
    f(&local) 
}

Captures are written as a dollar ($) followed by an identifier, a colon (:), and finally the kind of capture, which must be one of the following:

  • item: an item, like a function, struct, module, etc.
  • block: a block (i.e. a block of statements and/or an expression, surrounded by braces)
  • stmt: a statement
  • pat: a pattern
  • expr: an expression
  • ty: a type
  • ident: an identifier
  • path: a path (e.g. foo, ::std::mem::replace, transmute::<_, int>, …)
  • meta: a meta item; the things that go inside #[...] and #![...] attributes
  • tt: a single token tree

Each capture kind also restricts which tokens may follow it:

  • item: anything.
  • block: anything.
  • stmt: => , ;
  • pat: => , = if in
  • expr: => , ;
  • ty: , => : = > ; as
  • ident: anything.
  • path: , => : = > ; as
  • meta: anything.
  • tt: anything.
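
A sketch using the ident, ty, and expr captures, each followed by a comma, which the follow rules allow. Both macros are made up for illustration:

```rust
// Generates a function named by an ident capture, returning a ty capture,
// with an expr capture as its body.
macro_rules! make_getter {
    ($name:ident, $ty:ty, $value:expr) => {
        fn $name() -> $ty {
            $value
        }
    };
}

// Recursive macro over a comma-separated list of expr captures.
macro_rules! max_of {
    ($x:expr) => { $x };
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

make_getter!(answer, i32, 6 * 7);

fn main() {
    println!("{} {}", answer(), max_of!(3, 9, 4));
}
```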

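Macro debugging

1. trace_macros

trace_macros!(true);
Once this is set, the compiler prints every step of each macro expansion.
trace_macros!(false); stops the printing.

2. log_syntax!

This macro prints every token passed to it to the terminal at compile time, which is handy for debugging.

3. Macro expansion

rustc -Z unstable-options --pretty expanded hello.rs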

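Thread safety

Send and Sync

Rust provides two marker traits, Send and Sync, which are the foundation of its data-race-free concurrency.

  • A type that implements Send can have its values passed safely between threads; in other words, ownership can move across threads.
  • A type that implements Sync can have shared (immutable) references (&T) passed safely across threads.
    Typical types that do not implement Sync:
    Cell and RefCell (they can be modified through a shared reference without synchronization, so sharing them between threads would cause data races)
    Typical types that do not implement Send:
    Rc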
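
A sketch of sharing state across threads with types that are Send and Sync; parallel_sum and the two assert helpers are hypothetical names:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Each worker thread adds its index to a shared total.
fn parallel_sum(workers: u32) -> u32 {
    let total = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (1..=workers)
        .map(|i| {
            // Arc<Mutex<u32>> is Send, so the clone can move into the thread.
            let total = Arc::clone(&total);
            thread::spawn(move || {
                *total.lock().unwrap() += i;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

// Compile-time checks: these only compile for Send / Sync types.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn main() {
    assert_send::<Arc<Mutex<u32>>>();
    assert_sync::<Arc<Mutex<u32>>>();
    // assert_send::<std::rc::Rc<u32>>();    // would not compile: Rc is !Send
    // assert_sync::<std::cell::Cell<u32>>(); // would not compile: Cell is !Sync
    println!("{}", parallel_sum(4));
}
```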

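async and await

await

If the executor is multithreaded, any .await can move the task to a different thread,
so the task must be Send, and anything it holds a reference to across an .await must be Sync.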

Similarly, it isn't a good idea to hold a traditional non-futures-aware lock across an .await, as it can cause the threadpool to lock up: one task could take out a lock, .await and yield to the executor, allowing another task to attempt to take the lock and cause a deadlock. To avoid this, use the Mutex in futures::lock rather than the one from std::sync.

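Likewise, an ordinary lock should not be held across an .await inside an executor; use the lock provided by futures instead.

How async and await code is translated

A simple example that leaves out the complicated cases: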

let fut_one = ...;
let fut_two = ...;
async move {
    fut_one.await;
    fut_two.await;
}

This code is eventually translated into something like the following:


// The `Future` type generated by our `async { ... }` block
struct AsyncFuture {
    fut_one: FutOne,
    fut_two: FutTwo,
    state: State,
}

// List of states our `async` block can be in
enum State {
    AwaitingFutOne,
    AwaitingFutTwo,
    Done,
}

impl Future for AsyncFuture {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        loop {
            match self.state {
                State::AwaitingFutOne => match self.fut_one.poll(..) {
                    Poll::Ready(()) => self.state = State::AwaitingFutTwo,
                    Poll::Pending => return Poll::Pending,
                }
                State::AwaitingFutTwo => match self.fut_two.poll(..) {
                    Poll::Ready(()) => self.state = State::Done,
                    Poll::Pending => return Poll::Pending,
                }
                State::Done => return Poll::Ready(()),
            }
        }
    }
}
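
The pseudocode above can be turned into a runnable sketch using only std. Everything beyond the AsyncFuture and State shapes is an assumption for illustration: YieldOnce stands in for FutOne and FutTwo, NoopWake is a do-nothing waker, and count_polls is a toy busy-polling driver, not a real executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf future: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

enum State {
    AwaitingFutOne,
    AwaitingFutTwo,
    Done,
}

struct AsyncFuture {
    fut_one: YieldOnce,
    fut_two: YieldOnce,
    state: State,
}

impl Future for AsyncFuture {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // All fields are Unpin in this sketch, so a plain &mut is allowed.
        let this = self.get_mut();
        loop {
            match this.state {
                State::AwaitingFutOne => match Pin::new(&mut this.fut_one).poll(cx) {
                    Poll::Ready(()) => this.state = State::AwaitingFutTwo,
                    Poll::Pending => return Poll::Pending,
                },
                State::AwaitingFutTwo => match Pin::new(&mut this.fut_two).poll(cx) {
                    Poll::Ready(()) => this.state = State::Done,
                    Poll::Pending => return Poll::Pending,
                },
                State::Done => return Poll::Ready(()),
            }
        }
    }
}

/// Waker that does nothing; enough for a busy-polling loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the state machine to completion and returns the number of polls.
fn count_polls() -> usize {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = AsyncFuture {
        fut_one: YieldOnce { yielded: false },
        fut_two: YieldOnce { yielded: false },
        state: State::AwaitingFutOne,
    };
    let mut polls = 0;
    loop {
        polls += 1;
        if Pin::new(&mut fut).poll(&mut cx).is_ready() {
            return polls;
        }
    }
}

fn main() {
    println!("polls: {}", count_polls());
}
```

Each leaf future suspends once, so the outer future returns Pending twice and completes on the third poll.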

Pinning

Pin mainly serves borrows (&T and &mut T): it guarantees that the borrowed object is not moved.

An example that uses a borrow

async {
    let mut x = [0; 128];
    let read_into_buf_fut = read_into_buf(&mut x);
    read_into_buf_fut.await;
    println!("{:?}", x);
}

How would this code be translated?

struct ReadIntoBuf<'a> {
    buf: &'a mut [u8], // points to `x` below
}

struct AsyncFuture {
    x: [u8; 128],
    read_into_buf_fut: ReadIntoBuf<'what_lifetime?>,
}

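The key problem is that &T is essentially a pointer. AsyncFuture may be moved at any time, and when that happens the pointer inside ReadIntoBuf, which points at x within the same struct, becomes invalid.
That is why Pin is needed.
Pin comes in several forms, Pin<&mut T>, Pin<&T>, and Pin<Box<T>>, all of which guarantee that the T they point to will not be moved.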

Stream

The key to understanding Stream is that a Ready result may hold either Some or None; None means the stream has closed.

trait Stream {
    /// The type of the value yielded by the stream.
    type Item;

    /// Attempt to resolve the next item in the stream.
    /// Returns `Poll::Pending` if not ready, `Poll::Ready(Some(x))` if a value
    /// is ready, and `Poll::Ready(None)` if the stream has completed.
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>)
        -> Poll<Option<Self::Item>>;
}
async fn send_recv() {
    const BUFFER_SIZE: usize = 10;
    let (mut tx, mut rx) = mpsc::channel::<i32>(BUFFER_SIZE);

    tx.send(1).await.unwrap();
    tx.send(2).await.unwrap();
    drop(tx);

    // `StreamExt::next` is similar to `Iterator::next`, but returns a
    // type that implements `Future<Output = Option<T>>`.
    assert_eq!(Some(1), rx.next().await);
    assert_eq!(Some(2), rx.next().await);
    assert_eq!(None, rx.next().await);
}

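Atomics

Atomic Ordering has five variants in total:

  1. Sequentially consistent ordering: SeqCst.
  2. Relaxed ordering: Relaxed (it feels a little like volatile in C, although unlike volatile it does guarantee atomicity)
  3. The rest: Release, Acquire, AcqRel
    The five memory orderings Rust supports match those supported by the underlying LLVM.

Release-Acquire ordering

  In this model, store() uses memory_order_release and load() uses memory_order_acquire. The model has two effects; the first is that it restricts how CPU instructions may be reordered:

No read or write before the store() may be moved after it.
No read or write after the load() may be moved before it.
Reference: the article "理解 C++ 的 Memory Order" (Understanding C++ Memory Order)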

std::atomic<bool> ready{ false };
int data = 0;
void producer()
{
    data = 100;                                       // A
    ready.store(true, std::memory_order_release);     // B
}
void consumer()
{
    while (!ready.load(std::memory_order_acquire))    // C
        ;
    assert(data == 100); // never fails               // D
}
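
A rough Rust counterpart of the C++ example, as a sketch. One assumption differs from the original: safe Rust cannot share a plain int between threads, so data is an AtomicU32 accessed with Relaxed, and the Release/Acquire pair on ready is what orders it. The function name is made up.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Returns the value the consumer observes after seeing `ready == true`.
fn release_acquire() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(100, Ordering::Relaxed); // A
            ready.store(true, Ordering::Release); // B: A cannot move after B
        })
    };

    let consumer = thread::spawn(move || {
        // C: nothing after this load can move before it
        while !ready.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        data.load(Ordering::Relaxed) // D: sees the write from A
    });

    producer.join().unwrap();
    consumer.join().unwrap()
}

fn main() {
    println!("{}", release_acquire());
}
```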
memory_order_relaxed

Relaxed operation: there are no synchronization or ordering constraints imposed on other reads or writes, only this operation's atomicity is guaranteed (see Relaxed ordering below)

memory_order_consume

A load operation with this memory order performs a consume operation on the affected memory location: no reads or writes in the current thread dependent on the value currently loaded can be reordered before this load. Writes to data-dependent variables in other threads that release the same atomic variable are visible in the current thread. On most platforms, this affects compiler optimizations only (see Release-Consume ordering below)

memory_order_acquire

A load operation with this memory order performs the acquire operation on the affected memory location: no reads or writes in the current thread can be reordered before this load. All writes in other threads that release the same atomic variable are visible in the current thread (see Release-Acquire ordering below)

memory_order_release

A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store. All writes in the current thread are visible in other threads that acquire the same atomic variable (see Release-Acquire ordering below) and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic (see Release-Consume ordering below).

memory_order_acq_rel

A read-modify-write operation with this memory order is both an acquire operation and a release operation. No memory reads or writes in the current thread can be reordered before or after this store. All writes in other threads that release the same atomic variable are visible before the modification and the modification is visible in other threads that acquire the same atomic variable.

memory_order_seq_cst

A load operation with this memory order performs an acquire operation, a store performs a release operation, and read-modify-write performs both an acquire operation and a release operation, plus a single total order exists in which all threads observe all modifications in the same order (see Sequentially-consistent ordering below)

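Sized and ?Sized

A T: ?Sized bound means T may be either a fixed-size type or a DST (dynamically sized type). A type can be defined over a DST,
but a DST can only be used behind a pointer, as in the example below.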

use std::cell::UnsafeCell;

#[derive(Debug)]
struct FooSized<'a, T: ?Sized>(&'a T);

fn main() {
    let h_s = FooSized("hello");
    println!("{:?}", h_s);
    let _s = UnsafeCell::new(h_s.0);
}


The relationship between the Borrow and AsRef traits

pub trait AsRef<T: ?Sized> {
    fn as_ref(&self) -> &T;
}
pub trait Borrow<Borrowed: ?Sized> {
    fn borrow(&self) -> &Borrowed;
}

In form they are exactly the same; they differ only in design intent, and the different names tell users which one to reach for in which situation.

Choose Borrow when you want to abstract over different kinds of borrowing, or when you’re building a data structure that treats owned and borrowed values in equivalent ways, such as hashing and comparison.

Choose AsRef when you want to convert something to a reference directly, and you’re writing generic code.
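
A sketch of both situations. shout and lookup are hypothetical helpers; the Borrow side relies on HashMap::get, whose signature accepts any borrowed form of the key that hashes and compares like the owned key.

```rust
use std::collections::HashMap;

/// AsRef: generic code that just wants to view its argument as a &str.
fn shout<S: AsRef<str>>(s: S) -> String {
    s.as_ref().to_uppercase()
}

/// Borrow: the map stores String keys, but get() accepts a &str because
/// String: Borrow<str> and both forms hash and compare the same way.
fn lookup(map: &HashMap<String, u32>, key: &str) -> Option<u32> {
    map.get(key).copied()
}

fn main() {
    println!("{} {}", shout("abc"), shout(String::from("def")));

    let mut map = HashMap::new();
    map.insert(String::from("one"), 1);
    println!("{:?} {:?}", lookup(&map, "one"), lookup(&map, "two"));
}
```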

Some uses of PhantomData

Found while reading the tokio source code:

pub(crate) struct CachedParkThread {
    _anchor: PhantomData<Rc<()>>,
}
/// Used to ensure the invariants are respected
struct GenerationGuard<'a> {
    /// Worker reference
    worker: &'a Worker,

    /// Prevent `Sync` access
    _p: PhantomData<Cell<()>>,
}
pub(crate) struct Enter {
    _p: PhantomData<RefCell<()>>,
}

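The purpose here is to suppress the automatic Send and Sync implementations and so restrict how these structs can be used across threads: PhantomData<Rc<()>> makes a type neither Send nor Sync, while PhantomData<Cell<()>> and PhantomData<RefCell<()>> only remove Sync.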
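
A sketch of the same trick on two made-up marker structs. The negative cases cannot be shown in compiling code, so they appear as comments.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::rc::Rc;

/// Hypothetical type that stays Send but loses Sync.
struct NotSync {
    _p: PhantomData<Cell<()>>,
}

/// Hypothetical type that loses both Send and Sync.
struct NotSendNotSync {
    _p: PhantomData<Rc<()>>,
}

// Compiles only when T is Send.
fn assert_send<T: Send>() {}

/// PhantomData is zero-sized, so the markers cost nothing at run time.
fn sizes() -> (usize, usize) {
    (
        std::mem::size_of::<NotSync>(),
        std::mem::size_of::<NotSendNotSync>(),
    )
}

fn main() {
    assert_send::<NotSync>(); // Cell<()> is Send, so this is fine
    // A matching assert_sync::<NotSync>() would not compile.
    // assert_send::<NotSendNotSync>() would not compile either.
    let _a = NotSync { _p: PhantomData };
    let _b = NotSendNotSync { _p: PhantomData };
    println!("{:?}", sizes());
}
```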
