Indexes
Indexes are key structures that enable iteration over primary keys using value information. Let's consider a model where there are multiple tokens in the system, and each token has a unique owner. In this scenario, an owner must be associated with a token, and the tokens must be queryable by their respective owners.
struct Token { pub owner: Addr, pub ticker: String}
Tokens can be identified by an auto-incremented key, which will be used as the primary key. This ensures that each token is unique.
(TokenPK) -> Token
The index for the owner would be structured like this:
(owner, TokenPK) -> Token
TokenPK points to Token data, and the owner:TokenPK key points to a particular Token. With two database hits, the Token data can be accessed. To retrieve all the tokens managed by an owner, we can run a prefix range query as shown above.
pub const TOKENS: Map<U8Key, Token> = Map::new("tokens");// (owner Address, Token PK) -> u8 keypub const OWNER_INDEX: Map<(&Addr, U8Key), &[u8]> = Map::new("owner_tokenpk");
With the owner information as the key, tokens can be easily accessed. However, whenever the state of TOKENS changes, the owner field must also be updated accordingly.
storage-plus indexing
The aforementioned solution is functional, but suboptimal due to the high level of code complexity. This is where storage-plus/IndexedMap comes into play. IndexedMap is a storage handler that provides internal indexing. It offers two types of indexes: Unique Indexes and Multi Indexes.
Unique indexes
Ensuring the uniqueness of a data field in a database is a common requirement. UniqueIndex is an indexing helper that can help achieve this functionality."
#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, JsonSchema)]pub struct Token { pub owner: Addr, pub ticker: String, pub identifier: u8, // <---- unique value}// TokenIndexes structs keeps a list of indexerspub struct TokenIndexes<'a> { // token.identifier pub identifier: UniqueIndex<'a, U8Key, Token>,}// IndexList is just boilerplate code for fetching a struct's indexesimpl<'a> IndexList<Token> for TokenIndexes<'a> { fn get_indexes(&'_ self) -> Box<dyn Iterator<Item=&'_ dyn Index<Token>> + '_> { let v: Vec<&dyn Index<Token>> = vec![&self.identifier]; Box::new(v.into_iter()) }}// tokens() is the storage access function.pub fn tokens<'a>() -> IndexedMap<'a, &'a [u8], Token, TokenIndexes<'a>> { let indexes = TokenIndexes { identifier: UniqueIndex::new(|d| U8Key::new(d.identifier), "token_identifier"), }; IndexedMap::new(TOKEN_NAMESPACE, indexes)}
Let's go over the code step by step:
#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, JsonSchema)]pub struct Token { pub owner: Addr, pub ticker: String, pub identifier: u8, // <---- unique value}
A Token has several values, and among them, identifier is a unique value.
// TokenIndexes structs keeps a list of indexerspub struct TokenIndexes<'a> { // token.identifier pub identifier: UniqueIndex<'a, U8Key, Token>,}
TokenIndexes is a struct for defining indexes of Token struct.
impl<'a> IndexList<Token> for TokenIndexes<'a> { fn get_indexes(&'_ self) -> Box<dyn Iterator<Item=&'_ dyn Index<Token>> + '_> { let v: Vec<&dyn Index<Token>> = vec![&self.identifier]; Box::new(v.into_iter()) }}
IndexList is an interface for building the indexes.
pub fn tokens<'a>() -> IndexedMap<'a, &'a [u8], Token, TokenIndexes<'a>> { let indexes = TokenIndexes { identifier: UniqueIndex::new(|d| U8Key::new(d.identifier), "token_identifier"), }; IndexedMap::new(TOKEN_NAMESPACE, indexes)}
tokens() is storage function used to build IndexedMap.
identifier: UniqueIndex::new( | d| U8Key::new(d.identifier), "token_identifier"),
The above code is an index builder function. It builds composite keys with the given function, and accepts a key to identify the index bucket.
See test code below:
#[test]fn test_tokens() { let mut store = MockStorage::new(); let owner1 = Addr::unchecked("addr1"); let ticker1 = "TOKEN1".to_string(); let token1 = Token { owner: owner1.clone(), ticker: ticker1, identifier: 0, }; let token_id = increment_tokens(store.borrow_mut()).unwrap(); tokens().save(store.borrow_mut(), &U64Key::from(token_id).joined_key(), &token1).unwrap(); let ticker2 = "TOKEN2".to_string(); let token2 = Token { owner: owner1.clone(), ticker: ticker2, identifier: 0, }; let token_id = increment_tokens(store.borrow_mut()).unwrap(); // identifier clashes, must return error tokens().save(store.borrow_mut(), &U64Key::from(token_id).joined_key(), &token1).unwrap();}
The last line will crash with an error:
called `Result::unwrap()` on an **Err** value: GenericErr { msg: "Violates unique constraint on index" }
thread 'state::test::test_tokens' panicked at 'called `Result::unwrap()` on an **Err** value: GenericErr { msg: "Violates unique constraint on index" }', src/state.rs:197:90
stack backtrace:
Multi indexes
Multi indexes are used when the structure is indexed by non-unique values. Here is a case from the cw721
smart contract:
pub struct TokenIndexes<'a> { // secondary index by owner address // the last U64Key is the primary key which is an auto incremented token counter pub owner: MultiIndex<'a, (Vec<u8>, Vec<u8>), Token>,}// this may become a macro, not important just boilerplate, builds the list of indexes for later useimpl<'a> IndexList<Token> for TokenIndexes<'a> { fn get_indexes(&'_ self) -> Box<dyn Iterator<Item=&'_ dyn Index<Token>> + '_> { let v: Vec<&dyn Index<Token>> = vec![&self.owner]; Box::new(v.into_iter()) }}const TOKEN_NAMESPACE: &str = "tokens";pub fn tokens<'a>() -> IndexedMap<'a, &'a [u8], Token, TokenIndexes<'a>> { let indexes = TokenIndexes { owner: MultiIndex::new( |d, k| (index_string(d.owner.as_str()), k), TOKEN_NAMESPACE, "tokens__owner", ) }; IndexedMap::new(TOKEN_NAMESPACE, indexes)}
We see that the owner index is a MultiIndex. A multi-index can have repeated values as keys. That's why the primary key is added as the last element of the multi-index key. Like the name implies, this is an index over tokens, by owner. Given that an owner can have multiple tokens, we need a MultiIndex to be able to list / iterate over all the tokens a given owner has.
It's important to note that the key (and its components, in the case of a combined key) must implement the PrimaryKey trait. In this example, both the 2-tuple (_, _) and Vec
impl<'a, T: PrimaryKey<'a> + Prefixer<'a>, U: PrimaryKey<'a>> PrimaryKey<'a> for (T, U) { type Prefix = T; type SubPrefix = (); fn key(&self) -> Vec<&[u8]> { let mut keys = self.0.key(); keys.extend(&self.1.key()); keys }}
pub fn tokens<'a>() -> IndexedMap<'a, &'a str, TokenInfo, TokenIndexes<'a>> { let indexes = TokenIndexes { owner: MultiIndex::new( |d, k| (Vec::from(d.owner.as_ref()), k), "tokens", "tokens__owner", ), }; IndexedMap::new("tokens", indexes)}
During index creation, we must supply an index function per index.
owner: MultiIndex::new(|d, k| (Vec::from(d.owner.as_ref()), k),
The index function is responsible for taking the value and the primary key (always in Vec
Apart from the index function, we also need to provide the namespace of the primary key and the namespace for the new index.
IndexedMap::new("tokens", indexes)
Here, the namespace of the primary key must match the one used during index creation. We pass our TokenIndexes (as an IndexList-type parameter) as the second argument, thus connecting the underlying Map with the primary key and the defined indexes.
IndexedMap (and the other Indexed* types) is simply a wrapper/extension around Map that provides several index functions and namespaces for creating indexes over the original Map data. It also implements the calling of these index functions during value storage/modification/removal, so you can simply use the indexed data without worrying about implementation details.
Here is an example of how to use indexes in code:
#[test]fn test_tokens() { let mut store = MockStorage::new(); let owner1 = Addr::unchecked("addr1"); let ticker1 = "TOKEN1".to_string(); let token1 = Token { owner: owner1.clone(), ticker: ticker1, }; let ticker2 = "TOKEN2".to_string(); let token2 = Token { owner: owner1.clone(), ticker: ticker2, }; let token_id = increment_tokens(store.borrow_mut()).unwrap(); tokens().save(store.borrow_mut(), &U64Key::from(token_id).joined_key(), &token1).unwrap(); let token_id = increment_tokens(store.borrow_mut()).unwrap(); tokens().save(store.borrow_mut(), &U64Key::from(token_id).joined_key(), &token1).unwrap(); // want to load token using owner1 and ticker1 let list: Vec<_> = tokens() .idx.owner .prefix(index_string(owner1.as_str())) .range(&store, None, None, Order::Ascending) .collect::<StdResult<_>>().unwrap(); let (_, t) = &list[0]; assert_eq!(t, &token1); assert_eq!(2, list.len());}
Composite multi indexing
Let's consider the following situation: we have several batches stored by their (numeric) batch ID, which can change and must be automatically promoted after being modified. We want to process all pending batches with any status from Pending to Promoted, depending on interactions with them. Additionally, each batch has an associated expiration time. We are only interested in the pending batches that have already expired, so we can promote them. To accomplish this, we can build an index over the batches, with a composite key that includes both the batch status and its expiration timestamp. Using this composite key, we can exclude both the already promoted batches and the pending ones that have not yet expired.
To achieve this, we can build the index, generate the composite key, and iterate over all pending batches that have an expiration timestamp that is less than the current time.
Here's an example of how to do this using the Batch struct:
/// A Batch is a group of members who got voted in together. We need this to/// calculate moving from *Paid, Pending Voter* to *Voter*#[derive(Serialize, Deserialize, Clone, PartialEq, JsonSchema, Debug)]pub struct Batch { /// Timestamp (seconds) when all members are no longer pending pub grace_ends_at: u64, /// How many must still pay in their escrow before the batch is early authorized pub waiting_escrow: u32, /// All paid members promoted. We do this once when grace ends or waiting escrow hits 0. /// Store this one done so we don't loop through that anymore. pub batch_promoted: bool, /// List of all members that are part of this batch (look up ESCROWS with these keys) pub members: Vec<Addr>,}
IndexedMap definitions:
// We need a secondary index for batches, such that we can look up batches that have// not been promoted, ordered by expiration (ascending) from now.// Index: (U8Key/bool: batch_promoted, U64Key: grace_ends_at) -> U64Key: pkpub struct BatchIndexes<'a> { pub promotion_time: MultiIndex<'a, (U8Key, U64Key, U64Key), Batch>,}impl<'a> IndexList<Batch> for BatchIndexes<'a> { fn get_indexes(&'_ self) -> Box<dyn Iterator<Item=&'_ dyn Index<Batch>> + '_> { let v: Vec<&dyn Index<Batch>> = vec![&self.promotion_time]; Box::new(v.into_iter()) }}pub fn batches<'a>() -> IndexedMap<'a, U64Key, Batch, BatchIndexes<'a>> { let indexes = BatchIndexes { promotion_time: MultiIndex::new( |b: &Batch, pk: Vec<u8>| { let promoted = if b.batch_promoted { 1u8 } else { 0u8 }; (promoted.into(), b.grace_ends_at.into(), pk.into()) }, "batch", "batch__promotion", ), }; IndexedMap::new("batch", indexes)}
This example is similar to the previous one, above. The only differences are:
The composite key now has three elements: the batch status, the expiration timestamp, and the batch id (which is the
primary key for the Batch data). We're using a U64Key for the batch id / pk. This is just for convenience. We could as
well have used a plain Vec
Now, here's how to use the indexed data:
let batch_map = batches();// Limit to batches that have not yet been promoted (0), using sub_prefix.// Iterate which have expired at or less than the current time (now), using a bound.// These are all eligible for timeout-based promotionlet now = block.time.nanos() / 1_000_000_000;// as we want to keep the last item (pk) unbounded, we increment time by 1 and use exclusive (below the next tick)let max_key = (U64Key::from(now + 1), U64Key::from(0)).joined_key();let bound = Bound::Exclusive(max_key);let ready = batch_map .idx .promotion_time .sub_prefix(0u8.into()) .range(storage, None, Some(bound), Order::Ascending) .collect::<StdResult<Vec<_ >>>() ?;
A couple of comments:
- joined_key() is being used to create the range key. This helper function transforms the (partial) composite key
composed of batch expiration timestamp and batch id, into a Vec
with the proper format. That is then used to create a range bound. - sub_prefix() is used to fix the first element of the composite key (the batch status). This is required, because prefix() in this case (a 3-tuple), implies fixing the first two elements of the key, and we don't want / need that here.
- The iteration proceeds from None to the bound key created from the current timestamp. So that we effectively list only the still pending but already expired batches.
That's it. After that, we can iterate over the results and change their status from Pending to Promoted, or whatever we need to do.