rust-library-core


rust library core 🔗

Notes from reading the rust-lang source. core does not depend on any other member of the rust-lang workspace; it is self-contained, so it is the first crate to look at.

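Common sense 🔗

Basic approach: first do a BFS pass over the directory tree to get a sense of the layout and scope, then pick one module and read it DFS.
In each module, read mod.rs first -> inside mod.rs, read the pub use lines first.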

docs.rs <--> source code: read the two side by side.

docs.rs only shows public items (pub fn, pub struct, pub trait, and so on).


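Use the editor's "fold all" to collapse code details and "unfold all" to expand them.

pub(crate) struct NeverShortCircuit<T>(pub T); a pub(crate) item can be used anywhere inside its own crate, but is not visible outside it.

What is the relationship between core and alloc, and how do they differ?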

Modules we care about 🔗

[mem], [ptr], [macros], [task], [future], [iter]

Modules we ignore 🔗

[ffi], [hash], [num], [unicode]


[mem] module 🔗

This module provides the interfaces for working with memory.

  • mod.rs
  • manually_drop.rs
  • maybe_uninit.rs
  • transmutability.rs

mod.rs 🔗

pub use crate::intrinsics::transmute;

pub const fn size_of<T>() -> usize {
    intrinsics::size_of::<T>()
}
//Returns the size of the pointed-to value in bytes.
pub const fn size_of_val<T: ?Sized>(val: &T) -> usize {
    // SAFETY: `val` is a reference, so it's a valid raw pointer
    unsafe { intrinsics::size_of_val(val) }
}

// Raw-pointer variant of size_of_val; relevant for unsized types such as [slice] and [trait object]
pub const unsafe fn size_of_val_raw<T: ?Sized>(val: *const T) -> usize {
    // SAFETY: the caller must provide a valid raw pointer
    unsafe { intrinsics::size_of_val(val) }
}
// example
let x: [u8; 13] = [0; 13];
let y: &[u8] = &x;
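println!("slice size={}", std::mem::size_of_val(&y)); // 16 on a 64-bit target: size of the fat pointer &[u8] (data pointer + length)
println!("slice pointer to size = {}", std::mem::size_of_val(y)); // 13: size of the [u8] that y points to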
pub const fn swap<T>(x: &mut T, y: &mut T) {...}
pub(crate) const fn swap_simple<T>(x: &mut T, y: &mut T) {...}

... RTFM
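To make the mod.rs functions above concrete, here is a minimal sketch that exercises `size_of`, `size_of_val`, `swap`, and `take` through `std::mem` (which re-exports `core::mem`). The helper names `swapped` and `drain_slot` are made up for this note.

```rust
use std::mem;

/// Swaps two values with `mem::swap` and returns them.
fn swapped(mut a: i32, mut b: i32) -> (i32, i32) {
    mem::swap(&mut a, &mut b);
    (a, b)
}

/// Moves the vector out of `slot`, leaving `Vec::default()` behind.
fn drain_slot(slot: &mut Vec<u8>) -> Vec<u8> {
    mem::take(slot)
}

fn main() {
    assert_eq!(swapped(1, 2), (2, 1));

    let mut buf = vec![1, 2, 3];
    let old = drain_slot(&mut buf);
    assert_eq!(old, [1, 2, 3]);
    assert!(buf.is_empty());

    // size_of is a compile-time property of the type.
    assert_eq!(mem::size_of::<u32>(), 4);
    // For a slice, size_of_val is len * size_of::<element>().
    let xs: &[u16] = &[1, 2, 3];
    assert_eq!(mem::size_of_val(xs), 6);
}
```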

manually_drop.rs 🔗

//src/mem/manually_drop.rs#
pub struct ManuallyDrop<T: ?Sized> {
    value: T,
}
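A small sketch of what the wrapper does in practice: a value inside `ManuallyDrop` is not dropped when it goes out of scope, only when `ManuallyDrop::drop` is called explicitly. `Noisy` and `manual_drop_demo` are hypothetical names used only for this illustration.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

/// Hypothetical helper type: bumps a counter every time it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops counted after a scope exit, drops counted after an explicit drop).
fn manual_drop_demo() -> (u32, u32) {
    let drops = Cell::new(0);
    {
        let _wrapped = ManuallyDrop::new(Noisy(&drops));
        // `_wrapped` goes out of scope here, but its destructor is NOT run.
    }
    let after_scope = drops.get();

    let mut wrapped = ManuallyDrop::new(Noisy(&drops));
    // SAFETY: `wrapped` is never used again after this call.
    unsafe { ManuallyDrop::drop(&mut wrapped) };
    (after_scope, drops.get())
}

fn main() {
    assert_eq!(manual_drop_demo(), (0, 1));
}
```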

maybe_uninit.rs...transmutability.rs

///core/src/intrinsics.rs
    #[rustc_diagnostic_item = "transmute"]
    pub fn transmute<Src, Dst>(src: Src) -> Dst;
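As a quick illustration of `transmute`, the sketch below reinterprets the bits of an `f32` as a `u32`. In real code `f32::to_bits` is the safe way to do this; the function name `bits_via_transmute` is only an example.

```rust
use std::mem;

/// Reinterprets the bits of an f32 as a u32.
fn bits_via_transmute(x: f32) -> u32 {
    // SAFETY: f32 and u32 have the same size, and every bit pattern is a valid u32.
    unsafe { mem::transmute::<f32, u32>(x) }
}

fn main() {
    assert_eq!(bits_via_transmute(1.0), 0x3f80_0000);
    // Same result as the safe API.
    assert_eq!(bits_via_transmute(-2.5), (-2.5f32).to_bits());
}
```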

[ptr] module 🔗

core/src/ptr
Manually manage memory through raw pointers.

mod.rs 🔗

/// ```
/// let mut x = 0;
/// let y = &mut x as *mut i32;
/// let z = 12;
///
/// unsafe {
///     std::ptr::write(y, z);
///     assert_eq!(std::ptr::read(y), 12);
/// }
/// ```

mut_ptr.rs 🔗

    /// ```
    /// let mut s = [1, 2, 3];
    /// let ptr: *mut u32 = s.as_mut_ptr();
    ///
    /// unsafe {
    ///     println!("{}", *ptr.offset(1));
    ///     println!("{}", *ptr.offset(2));
    /// }
    /// ```
    pub const unsafe fn offset(self, count: isize) -> *mut T
    where
        T: Sized,
    {
        // SAFETY: the caller must uphold the safety contract for `offset`.
        // The obtained pointer is valid for writes since the caller must
        // guarantee that it points to the same allocated object as `self`.
        unsafe { intrinsics::offset(self, count) as *mut T }
    }

    /// ```
    /// let s: &str = "123";
    /// let ptr: *const u8 = s.as_ptr();
    ///
    /// unsafe {
    ///     println!("{}", *ptr.add(1) as char);
    ///     println!("{}", *ptr.add(2) as char);
    /// }
    /// ```
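The doc examples above only read through the pointer. Here is a minimal sketch that also writes, combining `add`, `read`, and `write` on a `*mut u32`. `double_in_place` is a made-up helper; a plain `iter_mut()` would be the idiomatic choice outside of a demonstration.

```rust
/// Doubles every element in place using raw pointer arithmetic.
fn double_in_place(data: &mut [u32]) {
    let len = data.len();
    let base: *mut u32 = data.as_mut_ptr();
    for i in 0..len {
        // SAFETY: i < len, so base.add(i) stays inside the slice,
        // and we hold the only (exclusive) reference to it.
        unsafe {
            let p = base.add(i);
            p.write(p.read() * 2);
        }
    }
}

fn main() {
    let mut v = [1, 2, 3];
    double_in_place(&mut v);
    assert_eq!(v, [2, 4, 6]);
}
```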

non_null.rs 🔗

//core/src/ptr/non_null.rs

pub struct NonNull<T: ?Sized> {
    pointer: *const T,
}
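A short sketch of how `NonNull` is typically used: it is built from a reference or checked with `NonNull::new`, and because null is excluded, `Option<NonNull<T>>` is the same size as a raw pointer. `increment_through_nonnull` is a hypothetical helper.

```rust
use std::mem::size_of;
use std::ptr::{self, NonNull};

/// Adds one to the value behind an exclusive reference, going through NonNull.
fn increment_through_nonnull(value: &mut i32) {
    let mut p: NonNull<i32> = NonNull::from(value);
    // SAFETY: `p` was just created from a live exclusive reference.
    unsafe { *p.as_mut() += 1 };
}

fn main() {
    let mut x = 41;
    increment_through_nonnull(&mut x);
    assert_eq!(x, 42);

    // NonNull::new rejects null pointers.
    assert!(NonNull::new(ptr::null_mut::<i32>()).is_none());
    // The null value is reused for None, so no extra space is needed.
    assert_eq!(size_of::<Option<NonNull<i32>>>(), size_of::<*mut i32>());
}
```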

unique.rs 🔗

//core/src/ptr/unique.rs

pub struct Unique<T: ?Sized> {
    pointer: NonNull<T>,
    _marker: PhantomData<T>,
}

[macros] module 🔗

  • mod.rs
  • panic.md
    • unwrap() on None or Err -> panic!()
    • panic! vs Result<T, E>

[alloc] module 🔗

Memory allocation APIs

  • mod.rs
  • global.rs
  • layout.rs

[array] module 🔗

Utilities for the array primitive type.

  • mod.rs
  • iter.rs
  • equality.rs

mod.rs 🔗

//core/src/array/mod.rs
mod iter;
pub use iter::IntoIter; // from outside the crate: use std::array::IntoIter;

iter.rs 🔗

//core/src/array/iter.rs
pub struct IntoIter<T, const N: usize> {
    /// This is the array we are iterating over.
    data: [MaybeUninit<T>; N],
    /// The elements in `data` that have not been yielded yet.
    alive: IndexRange,
}
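A minimal sketch of the by-value array iterator in use. `IntoIterator::into_iter` on an array returns `std::array::IntoIter<T, N>`, which yields owned elements; `concat` is a made-up function for this note.

```rust
use std::array;

/// Consumes an array of Strings and concatenates them.
fn concat(words: [String; 3]) -> String {
    // The iterator owns the array; elements are moved out one by one.
    let iter: array::IntoIter<String, 3> = IntoIterator::into_iter(words);
    let mut out = String::new();
    for w in iter {
        out.push_str(&w);
    }
    out
}

fn main() {
    let words = ["a".to_string(), "b".to_string(), "c".to_string()];
    assert_eq!(concat(words), "abc");
}
```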

equality.rs 🔗

[slice] module 🔗

mod.rs 🔗

iter.rs module in [slice] module 🔗

[async_iter] module 🔗


[iter] module 🔗

Iterators, a hallmark of functional languages, are implemented in this module.

mod.rs 🔗

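Organizes the externally visible iteration functionality: composable external iteration.
trait Iterator {...} is the undisputed center of the whole iteration machinery.
Whenever you work with a collection type, remember that the iter module defines many convenient operations on the Iterator trait.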

//! The heart and soul of this module is the [`Iterator`] trait. The core of
//! [`Iterator`] looks like this:
//!
//! ```
//! trait Iterator {
//!     type Item;
//!     fn next(&mut self) -> Option<Self::Item>;
//! }
//! ```

How do you implement the Iterator trait?

//! # Implementing Iterator
//!
//! Creating an iterator of your own involves two steps: 1 creating a `struct` to
//! hold the iterator's state, and 2 then implementing [`Iterator`] for that `struct`.
//! This is why there are so many `struct`s in this module: there is one for
//! each iterator and iterator adapter.
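The two steps described in the doc comment can be sketched as follows: a struct that holds the state, and an `Iterator` impl for it. `Countdown` is a hypothetical example type, not something from core.

```rust
/// Step 1: a struct that holds the iterator's state.
struct Countdown {
    remaining: u32,
}

/// Step 2: implement Iterator for that struct.
impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn main() {
    let items: Vec<u32> = Countdown { remaining: 3 }.collect();
    assert_eq!(items, [3, 2, 1]);
    // All provided methods (sum, map, filter, ...) come for free.
    assert_eq!(Countdown { remaining: 4 }.sum::<u32>(), 10);
}
```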

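Calling Iterator::next by hand is repetitive and tedious, so the language provides syntactic sugar for iteration, the for loop, which still calls next internally.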

//! There are three common methods which can create iterators from a collection:
//!
//! * `iter()`, which iterates over `&T`.
//! * `iter_mut()`, which iterates over `&mut T`.
//! * `into_iter()`, which iterates over `T`.
let values = vec![1, 2, 3, 4, 5];
for x in values {
    println!("{x}");
}
// IntoIterator trait: into_iter() converts a collection (such as a Vec) into an Iterator.
// The for loop above desugars to roughly this:
let values = vec![1, 2, 3, 4, 5];
{
    let result = match IntoIterator::into_iter(values) {
        mut iter => loop {
            let next;
            match iter.next() {
                Some(val) => next = val,
                None => break,
            };
            let x = next;
            let () = { println!("{x}"); };
        },
    };
    result
}
// Every iterator implements IntoIterator, which means:
//   1 any Iterator can be used directly in a for loop;
//   2 a collection that implements IntoIterator can also be used in a for loop.
impl<I: Iterator> IntoIterator for I

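Calling into_iter() on a collection consumes it (takes ownership); iterating by reference does not.

Iterating by reference: the implementations of iter() and iter_mut() are spread across each collection's own module, for example the slice module.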
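A minimal sketch contrasting the three ways of iterating over a `Vec`. The function name `three_ways` is made up for this note.

```rust
/// Demonstrates iter(), iter_mut() and into_iter() on a Vec.
fn three_ways() -> (i32, Vec<i32>, Vec<String>) {
    let mut nums = vec![1, 2, 3];

    // iter(): yields &i32, `nums` stays usable afterwards.
    let total: i32 = nums.iter().sum();

    // iter_mut(): yields &mut i32, elements are modified in place.
    for n in nums.iter_mut() {
        *n *= 10;
    }

    // into_iter(): yields owned Strings and consumes `names`.
    let names = vec![String::from("a"), String::from("b")];
    let owned: Vec<String> = names
        .into_iter()
        .map(|mut s| {
            s.push('!');
            s
        })
        .collect();
    // `names` can no longer be used here; it was moved into the iterator.

    (total, nums, owned)
}

fn main() {
    let (total, nums, owned) = three_ways();
    assert_eq!(total, 6);
    assert_eq!(nums, [10, 20, 30]);
    assert_eq!(owned, ["a!", "b!"]);
}
```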

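Iterator adapters are composable.
For example map, filter, and take each take an Iterator and return a new Iterator.
All adapters are lazy: nothing runs until a consumer such as collect drives the iteration and builds a new collection.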
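Laziness is easy to observe by counting how often the closure passed to `map` runs. In this sketch (`lazy_calls` is a made-up name) the closure is not called at all while the adapter chain is built, and only twice once `collect` pulls two items through `take(2)`.

```rust
use std::cell::Cell;

/// Returns (closure calls before collect, closure calls after collect, collected items).
fn lazy_calls() -> (u32, u32, Vec<i32>) {
    let data = [1, 2, 3, 4, 5];
    let calls = Cell::new(0u32);

    // Building the chain runs nothing yet.
    let adapter = data
        .iter()
        .map(|x| {
            calls.set(calls.get() + 1);
            x * 2
        })
        .take(2);
    let before = calls.get();

    // collect() drives the iteration; take(2) stops it after two items.
    let out: Vec<i32> = adapter.collect();
    (before, calls.get(), out)
}

fn main() {
    assert_eq!(lazy_calls(), (0, 2, vec![2, 4]));
}
```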

[sources] module in [iter] module 🔗

The sources folder provides several ready-made iterator implementations, for example one that builds an Iterator from a closure.

empty.rs module 🔗

Creates an iterator that yields nothing.

//example
use std::iter;
let mut nope = iter::empty::<i32>();
assert_eq!(None, nope.next());
// source
pub const fn empty<T>() -> Empty<T> {
    Empty(marker::PhantomData)
}
pub struct Empty<T>(marker::PhantomData<FnReturning<T>>);
// Implement the Iterator trait for Empty. Rust has no subtyping between a type and a trait; Empty is an iterator simply because it implements Iterator.
impl<T> Iterator for Empty<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        None
    }
}

from_fn.rs module 🔗

Creates a new iterator from a closure: each iteration calls the provided closure.

//example
let mut count = 0;
let counter = std::iter::from_fn(move ||{
        count += 1;
        if count < 6 {
            Some(count)
        } else {
            None
        }
    });
assert_eq!(counter.collect::<Vec<_>>(), &[1,2,3,4,5]);

Implementation of the FromFn iterator:

//core/src/iter/sources/from_fn.rs
pub fn from_fn<T, F>(f: F) -> FromFn<F>
where
    F: FnMut() -> Option<T>,
{
    FromFn(f)
}
pub struct FromFn<F>(F);
// Implement the Iterator trait for FromFn. FromFn is an iterator because it implements the Iterator interface; next just calls the stored closure.
impl<T, F> Iterator for FromFn<F>
where
    F: FnMut() -> Option<T>,
{
    type Item = T;
    fn next(&mut self) -> Option<Self::Item> {
        (self.0)()
    }
}

from_generator.rs module 🔗

#[unstable(feature = "iter_from_generator", issue = "43122", reason = "generators are unstable")]
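This is an unstable, nightly-only feature. To use it, add #![feature(iter_from_generator)] to the crate attributes; otherwise rustc reports error E0658. The feature and function names depend on the toolchain version (newer nightlies renamed generators to coroutines).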

 let it = std::iter::from_generator(|| {
     yield 1;
     yield 2;
     yield 3;
 });
 let v: Vec<_> = it.collect();
 assert_eq!(v, [1, 2, 3]);

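[Other] modules: once_with.rs, once.rs, repeat_with.rs, repeat.rs, successors.rs 🔗

These also implement Iterator for a tuple struct to produce one kind of iterator. They are broadly the same and differ only in the next method,
so they are skipped here.
The pattern is struct(T), where T can be a closure or a Generator (unstable feature).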
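Even though the implementations are skipped, a quick usage sketch shows how these sources differ in what `next` returns. `powers_of_two_below` is a made-up helper built on `iter::successors`.

```rust
use std::iter;

/// Powers of two strictly below `limit`, built with iter::successors.
fn powers_of_two_below(limit: u32) -> Vec<u32> {
    iter::successors(Some(1u32), |&x| x.checked_mul(2))
        .take_while(|&x| x < limit)
        .collect()
}

fn main() {
    // once: yields exactly one value.
    assert_eq!(iter::once(7).collect::<Vec<_>>(), [7]);

    // repeat: yields clones of the same value forever, so it needs take().
    assert_eq!(iter::repeat("ab").take(3).collect::<String>(), "ababab");

    // repeat_with: calls the closure for every item.
    let mut n = 0;
    let evens: Vec<i32> = iter::repeat_with(|| {
        n += 2;
        n
    })
    .take(3)
    .collect();
    assert_eq!(evens, [2, 4, 6]);

    // successors: each item is computed from the previous one.
    assert_eq!(powers_of_two_below(20), [1, 2, 4, 8, 16]);
}
```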

[traits] module in [iter] module 🔗

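Defines all iteration-related behavior, i.e. the traits that specify the shape of everything that can be iterated.

collect.rs 🔗

Converting a concrete collection into an Iterator: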

pub trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
    fn into_iter(self) -> Self::IntoIter;
}

Converting an Iterator back into a concrete collection, for example when collect() is called:

pub trait FromIterator<A>: Sized {
    fn from_iter<T: IntoIterator<Item = A>>(iter: T) -> Self;
}
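To see both traits from the implementer's side, here is a minimal sketch for a hypothetical wrapper type `Bag`. Implementing `FromIterator` makes `collect()` work, and implementing `IntoIterator` makes the type usable in a for loop.

```rust
/// Hypothetical collection type wrapping a Vec.
#[derive(Debug, PartialEq)]
struct Bag(Vec<i32>);

// Iterator -> collection: this is what collect() calls.
impl std::iter::FromIterator<i32> for Bag {
    fn from_iter<I: IntoIterator<Item = i32>>(iter: I) -> Self {
        let mut items = Vec::new();
        for x in iter {
            items.push(x);
        }
        Bag(items)
    }
}

// Collection -> Iterator: this is what a for loop calls.
impl IntoIterator for Bag {
    type Item = i32;
    type IntoIter = std::vec::IntoIter<i32>;

    fn into_iter(self) -> Self::IntoIter {
        self.0.into_iter()
    }
}

fn main() {
    let bag: Bag = (1..=3).collect();
    assert_eq!(bag, Bag(vec![1, 2, 3]));

    let mut sum = 0;
    for x in bag {
        sum += x;
    }
    assert_eq!(sum, 6);
}
```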

iterator.rs 🔗

pub trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    // a large number of provided (default) methods:
    // map, take, filter, count ...
    ...
}

Composing adapters gives method chaining. The process looks like this:

[1, 2, 3].iter().map(|x| x * 2).filter(|x| *x > 2).take(2).collect::<Vec<_>>(); // the filter predicate is a placeholder
// array [1, 2, 3] -> slice &[1, 2, 3]
//     -> iterator struct Iter (defined in core/src/slice/iter.rs, which implements Iterator through a macro: https://doc.rust-lang.org/stable/src/core/slice/iter.rs.html#135-145 )
//     -> adapter struct Map (an Iterator)
//     -> adapter struct Filter (an Iterator)
//     -> adapter struct Take (an Iterator)
//     -> Vec<_>::from_iter() (in alloc/src/vec/mod.rs, implemented at https://doc.rust-lang.org/stable/src/alloc/vec/mod.rs.html#2648-2650)
[adapters] module in [iter] module 🔗

map.rs module 🔗

Defines the Map struct and implements the Iterator trait for it.

pub struct Map<I, F> {
    pub(crate) iter: I,
    f: F,
}
impl<B, I: Iterator, F> Iterator for Map<I, F>
where
    F: FnMut(I::Item) -> B,
{
    type Item = B;
    fn next(&mut self) -> Option<B> {
        self.iter.next().map(&mut self.f)
    }
}

The take.rs and filter.rs modules follow the same pattern.
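The same pattern works for a hand-written adapter. The sketch below is a hypothetical `EveryOther` adapter that wraps an inner iterator and skips every second item, plus an extension trait so it can be chained like the built-in ones. It assumes the inner iterator keeps returning None once exhausted.

```rust
/// Hypothetical adapter: yields the 1st, 3rd, 5th, ... item of the inner iterator.
struct EveryOther<I> {
    iter: I,
}

impl<I: Iterator> Iterator for EveryOther<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        let item = self.iter.next()?;
        // Discard the following element, if there is one.
        let _ = self.iter.next();
        Some(item)
    }
}

/// Extension trait so the adapter can be used in method-chaining style.
trait EveryOtherExt: Iterator + Sized {
    fn every_other(self) -> EveryOther<Self> {
        EveryOther { iter: self }
    }
}

impl<I: Iterator> EveryOtherExt for I {}

fn main() {
    let odd: Vec<i32> = (1..=5).every_other().collect();
    assert_eq!(odd, [1, 3, 5]);
    // It composes with the standard adapters.
    let doubled: Vec<i32> = (1..=4).every_other().map(|x| x * 2).collect();
    assert_eq!(doubled, [2, 6]);
}
```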


[ops] module 🔗

[str] module 🔗

cell.rs module 🔗

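Shareable mutable containers: they allow a value to be mutated through a shared reference (interior mutability).
Cell<T> and RefCell<T> do not implement Sync, so they can only be shared within a single thread.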

use std::cell::{RefCell, RefMut};
use std::collections::HashMap;
use std::rc::Rc;

fn main() {
    let shared_map: Rc<RefCell<_>> = Rc::new(RefCell::new(HashMap::new()));
    // Create a new block to limit the scope of the dynamic borrow
    {
        let mut map: RefMut<_> = shared_map.borrow_mut();
        map.insert("africa", 92388);
        map.insert("kyoto", 11837);
        map.insert("piccadilly", 11826);
        map.insert("marbles", 38);
    }
    let total: i32 = shared_map.borrow().values().sum();
    println!("{total}");
}

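When an Rc<T> is cloned, Rc::clone(&rc) only holds a shared reference, yet it still has to update the reference count, so the internal counts are wrapped in Cell.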

use std::cell::Cell;
use std::ptr::NonNull;
use std::process::abort;
use std::marker::PhantomData;

struct Rc<T: ?Sized> {
    ptr: NonNull<RcBox<T>>,
    phantom: PhantomData<RcBox<T>>,
}

struct RcBox<T: ?Sized> {
    strong: Cell<usize>,
    refcount: Cell<usize>,
    value: T,
}

impl<T: ?Sized> Clone for Rc<T> {
    fn clone(&self) -> Rc<T> {
        self.inc_strong();
        Rc {
            ptr: self.ptr,
            phantom: PhantomData,
        }
    }
}

trait RcBoxPtr<T: ?Sized> {

    fn inner(&self) -> &RcBox<T>;

    fn strong(&self) -> usize {
        self.inner().strong.get()
    }

    fn inc_strong(&self) {
        self.inner()
            .strong
            .set(self.strong()
                     .checked_add(1)
                     .unwrap_or_else(|| abort() ));
    }
}

impl<T: ?Sized> RcBoxPtr<T> for Rc<T> {
   fn inner(&self) -> &RcBox<T> {
       unsafe {
           self.ptr.as_ref()
       }
   }
}

Cell<T> does not implement the Deref trait.
This is where Cell and RefCell differ from ordinary smart pointers.
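A minimal sketch of `Cell` in use: the value is copied out with `get` and overwritten with `set`, all through `&self`, and no reference to the inside is ever handed out. `Hits` and `count_hits` are made-up names.

```rust
use std::cell::Cell;

/// Hypothetical counter that can be bumped through a shared reference.
struct Hits {
    count: Cell<u32>,
}

impl Hits {
    // Note: &self, not &mut self.
    fn record(&self) {
        self.count.set(self.count.get() + 1);
    }
}

fn count_hits(n: u32) -> u32 {
    let hits = Hits { count: Cell::new(0) };
    for _ in 0..n {
        hits.record();
    }
    hits.count.get()
}

fn main() {
    assert_eq!(count_hits(3), 3);

    // replace returns the old value and stores the new one.
    let c = Cell::new(5);
    assert_eq!(c.replace(9), 5);
    assert_eq!(c.get(), 9);
}
```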

//core/src/cell.rs#
pub struct Cell<T: ?Sized> {
    value: UnsafeCell<T>,
}
pub struct UnsafeCell<T: ?Sized> {
    value: T,
}
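// UnsafeCell<T>::get(&self) -> *mut T
// Cell::get returns a copy of the value (requires T: Copy)
// Cell::set takes the address behind the shared reference, casts it to a raw pointer, and writes to that memory
impl<T: ?Sized> UnsafeCell<T> {
    pub const fn get(&self) -> *mut T {
        // from a shared reference -> straight to a mutable raw pointer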
        self as *const UnsafeCell<T> as *const T as *mut T
    }
}
// Checks the borrow rules at run time and panics when they are violated
pub struct RefCell<T: ?Sized> {
    borrow: Cell<BorrowFlag>,
    value: UnsafeCell<T>,
}
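The run-time check can be observed without panicking by using `try_borrow_mut`, which returns an Err where `borrow_mut` would panic. The two function names below are made up for this sketch.

```rust
use std::cell::RefCell;

/// A second mutable borrow while the first is alive is rejected at run time.
fn second_mut_borrow_fails() -> bool {
    let cell = RefCell::new(vec![1]);
    let _first = cell.borrow_mut();
    // borrow_mut() would panic here; try_borrow_mut() reports the conflict instead.
    let failed = cell.try_borrow_mut().is_err();
    failed
}

/// Once the first borrow is released, borrowing works again.
fn borrow_after_release() -> usize {
    let cell = RefCell::new(vec![1]);
    {
        cell.borrow_mut().push(2);
    } // the RefMut is dropped here, releasing the borrow flag
    let len = cell.borrow().len();
    len
}

fn main() {
    assert!(second_mut_borrow_fails());
    assert_eq!(borrow_after_release(), 2);
}
```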

For multi-threaded code, use Mutex<T>, RwLock<T>, or the atomic types instead.

cmp.rs module 🔗

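Utilities for comparing and ordering.
Implementing PartialEq for a type (with Eq as the marker for full equivalence) overloads the == and != operators.
Implementing PartialOrd for a type (with Ord for total ordering) overloads the <, <=, >, >= operators.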
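A short sketch of operator overloading through these traits, using a hypothetical `Version` type. Equality is derived, while the ordering is written by hand so that it compares `major` first and `minor` second.

```rust
use std::cmp::Ordering;

/// Hypothetical type; == and != come from the derived PartialEq.
#[derive(Debug, PartialEq, Eq)]
struct Version {
    major: u32,
    minor: u32,
}

impl Ord for Version {
    fn cmp(&self, other: &Self) -> Ordering {
        self.major
            .cmp(&other.major)
            .then(self.minor.cmp(&other.minor))
    }
}

// <, <=, >, >= are resolved through PartialOrd.
impl PartialOrd for Version {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn main() {
    let a = Version { major: 1, minor: 2 };
    let b = Version { major: 1, minor: 10 };
    assert!(a < b);
    assert!(a != b);
    assert_eq!(a.cmp(&b), Ordering::Less);
    // Ord also unlocks max/min and sorting.
    assert_eq!(a.max(b), Version { major: 1, minor: 10 });
}
```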

error.rs module and error.md 🔗

Rust provides two error-handling mechanisms: panic and Result<T, E>.
The trait std::error::Error is the common base trait that error types are expected to implement.
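A minimal sketch of the Result side: a hypothetical error type `EmptyInput` implements `Error`, and the `?` operator converts a standard library error into `Box<dyn Error>`. `parse_positive` is a made-up function.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type for this example.
#[derive(Debug)]
struct EmptyInput;

impl fmt::Display for EmptyInput {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "input was empty")
    }
}

// Error requires Debug + Display; the trait itself has no required methods.
impl Error for EmptyInput {}

fn parse_positive(s: &str) -> Result<u32, Box<dyn Error>> {
    if s.is_empty() {
        return Err(Box::new(EmptyInput));
    }
    // `?` converts ParseIntError into Box<dyn Error> because it implements Error.
    let n: u32 = s.trim().parse()?;
    Ok(n)
}

fn main() {
    assert_eq!(parse_positive("42").unwrap(), 42);
    assert_eq!(parse_positive("").unwrap_err().to_string(), "input was empty");
    assert!(parse_positive("abc").is_err());
}
```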

marker.rs module 🔗

marker.rs defines several fundamental properties of types.

  • Copy: types whose values can be duplicated simply by copying bits.
  • Send: types that can be transferred across thread boundaries.
  • Sized: types with a constant size known at compile time.
  • Sync
    • Types whose references can be shared between threads.
    • Definition: T: Sync if and only if &T: Send. In other words, handing a &T to another thread cannot cause a data race; the trait marks that sharing the reference across threads is safe.
  • Unpin: types that can be safely moved after being pinned.
pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
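    F: FnOnce() -> T,      // the closure
    F: Send + 'static,     // the closure must be Send (it is passed by value to the new thread) and must not borrow anything that lives shorter than 'static
    T: Send + 'static,     // the return value must be Send (it is passed back by value) and must also be 'static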
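The bounds on spawn can be tried out with a small sketch. `assert_send` and `assert_sync` are hypothetical helpers that only compile when the type has the marker; `sum_in_thread` moves an owned Vec into the closure, which satisfies `Send + 'static`.

```rust
use std::sync::Arc;
use std::thread;

// Compile-time checks: calling these only type-checks if the bound holds.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

/// Sums the vector on a worker thread and returns the result.
fn sum_in_thread(data: Vec<i32>) -> i32 {
    // `move` transfers ownership of `data` into the closure, so nothing is borrowed.
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    assert_send::<Vec<i32>>();
    assert_sync::<Arc<i32>>();
    // assert_send::<std::rc::Rc<i32>>(); // would not compile: Rc is not Send

    assert_eq!(sum_in_thread(vec![1, 2, 3]), 6);
}
```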

todo going on...

option.rs module 🔗

result.rs module 🔗

pin.rs module 🔗

lib.rs module 🔗

What functionality does the core crate as a whole expose?