ObjectRepository - .NET in-memory repository pattern for your home projects


Why keep all the data in memory?


When you need to store data for a website or a backend, most sensible people reach for a SQL database first.


But sometimes the data model turns out to be a poor fit for SQL: for example, when building search or a social graph, you need to query complex relationships between objects.


The worst case is when you work in a team and a colleague does not know how to write fast queries. How much time have you spent solving N+1 problems and building additional indexes so that the SELECT on the main page completes in a reasonable time?


Another popular approach is NoSQL. A few years ago there was a lot of hype around it: MongoDB was deployed at every opportunity, and people were happy to get answers in the form of JSON documents (by the way, how many hacks did you have to add because of circular references in the documents?).


Why not try keeping all the data in the application's memory and periodically saving it to arbitrary storage (a file, a remote database)?


Memory has become cheap, and all the data of most small and medium projects will fit into 1 GB of memory. (For example, my favorite home project, a financial tracker that keeps daily statistics and a history of my spending, balances, and transactions for a year and a half, consumes only 45 MB of memory.)


Pros:


  • Data access becomes easier: no need to worry about queries, lazy loading, or ORM quirks; you work with ordinary C# objects;
  • No problems related to access from different threads;
  • It is very fast: there are no network requests, no translation of code into a query language, and no (de)serialization of objects;
  • Data can be stored in any form: in XML on disk, in SQL Server, or in Azure Table Storage.

Cons:


  • Horizontal scaling is lost, and as a result you cannot do zero-downtime deployments;
  • If the application crashes, you can lose some data. (But our application never crashes, right?)

How does it work?


The algorithm is as follows:


  • At startup, a connection to the data storage is established and the data is loaded;
  • An object model, primary indices, and relationship indices (1:1, 1:Many) are built;
  • A subscription is created for changes to object properties (INotifyPropertyChanged) and for adding or removing items in collections (INotifyCollectionChanged);
  • When a subscription fires, the changed object is added to the queue for writing to the data storage;
  • Periodically (on a timer), changes are saved to the storage in a background thread;
  • Changes are also saved to the storage when the application exits.
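
The steps above can be illustrated with a small standalone sketch. It does not use the ObjectRepository library: TrackedItem and ChangeTracker are hypothetical names invented for this example, and the only real APIs involved are INotifyPropertyChanged, ConcurrentQueue, and System.Threading.Timer from the base class library. It shows the general idea only (subscribe, queue the changed object, flush on a timer and on exit), not how the library itself is implemented.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.ComponentModel;
using System.Threading;

// Hypothetical model: raises PropertyChanged whenever Value actually changes.
public sealed class TrackedItem : INotifyPropertyChanged
{
    private string _value = "";
    public event PropertyChangedEventHandler PropertyChanged;

    public string Value
    {
        get => _value;
        set
        {
            if (_value == value) return;
            _value = value;
            PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(nameof(Value)));
        }
    }
}

// Hypothetical tracker: queues changed objects and hands them to a save callback.
public sealed class ChangeTracker : IDisposable
{
    private readonly ConcurrentQueue<object> _pending = new ConcurrentQueue<object>();
    private readonly object _flushLock = new object();
    private readonly Action<IReadOnlyList<object>> _save;
    private readonly Timer _timer;

    public ChangeTracker(Action<IReadOnlyList<object>> save, TimeSpan period)
    {
        _save = save;
        _timer = new Timer(_ => Flush(), null, period, period); // background save
    }

    // Subscribe to property changes; every change enqueues the object itself.
    public void Track(INotifyPropertyChanged item) =>
        item.PropertyChanged += (_, __) => _pending.Enqueue(item);

    public void Flush()
    {
        lock (_flushLock)
        {
            var batch = new List<object>();
            while (_pending.TryDequeue(out var changed))
                if (!batch.Contains(changed)) batch.Add(changed); // one write per object
            if (batch.Count > 0) _save(batch);
        }
    }

    public void Dispose()
    {
        _timer.Dispose();
        Flush(); // save whatever is still queued when the application exits
    }
}

public static class Demo
{
    public static void Main()
    {
        var saved = new List<object>();
        using (var tracker = new ChangeTracker(batch => saved.AddRange(batch), TimeSpan.FromSeconds(5)))
        {
            var item = new TrackedItem();
            tracker.Track(item);
            item.Value = "a";
            item.Value = "b";
            tracker.Flush();
            Console.WriteLine(saved.Count); // prints 1: two changes, one object to write
        }
    }
}
```

Because the queue holds references to live objects, the save callback always sees the latest state of each object, no matter how many times it changed between two flushes.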

Sample Code


Add the necessary dependencies:
  # Main library
  Install-Package OutCode.EscapeTeams.ObjectRepository

  # Data storage where changes will be saved
  # (install only the one you will use)
  Install-Package OutCode.EscapeTeams.ObjectRepository.File
  Install-Package OutCode.EscapeTeams.ObjectRepository.LiteDb
  Install-Package OutCode.EscapeTeams.ObjectRepository.AzureTableStorage

  # Optional: if you need to store the Hangfire data model
  # Install-Package OutCode.EscapeTeams.ObjectRepository.Hangfire

Describe the data model that will be kept in the storage:
  public class ParentEntity : BaseEntity
  {
      public ParentEntity(Guid id) => Id = id;
  }

  public class ChildEntity : BaseEntity
  {
      public ChildEntity(Guid id) => Id = id;
      public Guid ParentId { get; set; }
      public string Value { get; set; }
  }

Then the object model:
  public class ParentModel : ModelBase
  {
      public ParentModel(ParentEntity entity)
      {
          Entity = entity;
      }

      public ParentModel()
      {
          Entity = new ParentEntity(Guid.NewGuid());
      }

      // Example of a 1:Many relationship
      public IEnumerable<ChildModel> Children => Multiple<ChildModel>(x => x.ParentId);

      protected override BaseEntity Entity { get; }
  }

  public class ChildModel : ModelBase
  {
      private ChildEntity _childEntity;

      public ChildModel(ChildEntity entity)
      {
          _childEntity = entity;
      }

      public ChildModel()
      {
          _childEntity = new ChildEntity(Guid.NewGuid());
      }

      public Guid ParentId
      {
          get => _childEntity.ParentId;
          set => UpdateProperty(() => _childEntity.ParentId, value);
      }

      public string Value
      {
          get => _childEntity.Value;
          set => UpdateProperty(() => _childEntity.Value, value);
      }

      // Access through an index lookup
      public ParentModel Parent => Single<ParentModel>(ParentId);

      protected override BaseEntity Entity => _childEntity;
  }

Finally, the repository class for data access itself:
  public class MyObjectRepository : ObjectRepositoryBase
  {
      public MyObjectRepository(IStorage storage) : base(storage, NullLogger.Instance)
      {
          IsReadOnly = true; // For tests: changes are not saved to the database

          AddType((ParentEntity x) => new ParentModel(x));
          AddType((ChildEntity x) => new ChildModel(x));

          // If you use Hangfire and need to keep its data model in the ObjectRepository:
          // this.RegisterHangfireScheme();

          Initialize();
      }
  }

Create an ObjectRepository instance:


  var memory = new MemoryStream();
  var db = new LiteDatabase(memory);
  var dbStorage = new LiteDbStorage(db);

  var repository = new MyObjectRepository(dbStorage);
  await repository.WaitForInitialize();

If the project uses Hangfire:
  public void ConfigureServices(IServiceCollection services, ObjectRepository objectRepository)
  {
      services.AddHangfire(s => s.UseHangfireStorage(objectRepository));
  }

Insert a new object:


  var newParent = new ParentModel();
  repository.Add(newParent);

With this call, the ParentModel object is added both to the local cache and to the queue for writing to the database. The operation therefore takes O(1), and you can start working with the object immediately.


For example, to find this object in the repository and make sure that the returned object is the same instance:


  var parents = repository.Set<ParentModel>();
  var myParent = parents.Find(newParent.Id);
  Assert.IsTrue(ReferenceEquals(myParent, newParent));

What happens under the hood? Set<ParentModel>() returns a TableDictionary<ParentModel>, which contains a ConcurrentDictionary<ParentModel, ParentModel> and adds primary and secondary index functionality on top of it. This makes it possible to search by Id (or by other arbitrary user-defined indices) without iterating over all the objects.
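
As a rough illustration of why an indexed lookup avoids a full scan, here is a standalone sketch. Person and IndexedTable are hypothetical types made up for this example; this is not the library's TableDictionary, just the same idea expressed with two ConcurrentDictionary instances: one keyed by Id and one keyed by a unique secondary value.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical entity used only for this illustration.
public sealed class Person
{
    public Person(Guid id, string email) { Id = id; Email = email; }
    public Guid Id { get; }
    public string Email { get; }
}

// Hypothetical table: a primary index by Id plus one unique secondary index by Email.
public sealed class IndexedTable
{
    private readonly ConcurrentDictionary<Guid, Person> _byId =
        new ConcurrentDictionary<Guid, Person>();
    private readonly ConcurrentDictionary<string, Person> _byEmail =
        new ConcurrentDictionary<string, Person>();

    public void Add(Person person)
    {
        // The secondary index maps one value to one object, so values must be unique.
        if (!_byEmail.TryAdd(person.Email, person))
            throw new InvalidOperationException("Secondary index requires unique values.");
        _byId[person.Id] = person;
    }

    // Both lookups are hash lookups; neither iterates over the stored objects.
    public Person Find(Guid id) =>
        _byId.TryGetValue(id, out var person) ? person : null;

    public Person FindByEmail(string email) =>
        _byEmail.TryGetValue(email, out var person) ? person : null;
}

public static class Demo
{
    public static void Main()
    {
        var table = new IndexedTable();
        var alice = new Person(Guid.NewGuid(), "alice@example.com");
        table.Add(alice);

        // Both indices return the very same instance that was added.
        Console.WriteLine(ReferenceEquals(table.Find(alice.Id), alice));               // True
        Console.WriteLine(ReferenceEquals(table.FindByEmail("alice@example.com"), alice)); // True
    }
}
```

The indices store references rather than copies, which is why a lookup hands back the same instance that was added.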


When objects are added to the ObjectRepository, it subscribes to changes of their properties, so any property change also puts the object into the write queue.
From the outside, updating properties looks the same as working with a POCO object:


  myParent.Children.First().Value = "Updated value";

You can delete an object in the following ways:


  repository.Remove(myParent);
  repository.RemoveRange(otherParents);
  repository.Remove<ParentModel>(x => !x.Children.Any());

This also adds the object to the queue for deletion.


How does saving work?


When tracked objects change (an object is added or deleted, or a property changes), ObjectRepository raises the ModelChanged event, to which IStorage is subscribed. When ModelChanged fires, IStorage implementations put the changes into three queues: add, update, and delete.


During initialization, IStorage implementations also create a timer that saves the changes every 5 seconds.


In addition, there is an API for forcing a save: ObjectRepository.Save().


Before each save, meaningless operations are first removed from the queues (for example, duplicate events, when an object was changed twice, or objects that were added and then quickly deleted), and only then is the save itself performed.
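
To make the clean-up step concrete, here is a minimal sketch of one way such compaction could work. ChangeKind and ChangeCompactor are hypothetical names, and the rules are my simplification rather than the library's actual logic: repeated updates collapse into one, an update after a pending add stays an add, and an add followed by a delete cancels out. Re-adding an object with the same Id after a delete is deliberately not handled.

```csharp
using System;
using System.Collections.Generic;

public enum ChangeKind { Add, Update, Delete }

public static class ChangeCompactor
{
    // Reduces a stream of (Id, Kind) events to at most one pending operation per object.
    public static Dictionary<Guid, ChangeKind> Compact(
        IEnumerable<(Guid Id, ChangeKind Kind)> events)
    {
        var result = new Dictionary<Guid, ChangeKind>();
        foreach (var (id, kind) in events)
        {
            if (!result.TryGetValue(id, out var previous))
            {
                result[id] = kind; // first event for this object
            }
            else if (previous == ChangeKind.Add && kind == ChangeKind.Delete)
            {
                result.Remove(id); // added and deleted before any save: nothing to write
            }
            else if (previous == ChangeKind.Add && kind == ChangeKind.Update)
            {
                // Still a pending Add; the current state of the object is written anyway.
            }
            else
            {
                result[id] = kind; // Update+Update -> Update, Update+Delete -> Delete
            }
        }
        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        Guid a = Guid.NewGuid(), b = Guid.NewGuid(), c = Guid.NewGuid();
        var events = new List<(Guid Id, ChangeKind Kind)>
        {
            (a, ChangeKind.Add), (a, ChangeKind.Update),    // stays a single Add
            (b, ChangeKind.Update), (b, ChangeKind.Update), // collapses to one Update
            (c, ChangeKind.Add), (c, ChangeKind.Delete),    // cancels out
        };

        var ops = ChangeCompactor.Compact(events);
        Console.WriteLine(ops.Count); // prints 2
        Console.WriteLine(ops[a]);    // prints Add
        Console.WriteLine(ops[b]);    // prints Update
    }
}
```

Collapsing is safe only because the object that gets written is the live, current one; if each queue entry carried a snapshot, dropping intermediate updates would lose data.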


In all cases the current object is saved in its entirety, so objects may be saved in a different order than they were changed, and in a newer version than the one that existed when they were added to the queue.


What else is there?


  • All libraries target .NET Standard 2.0, so they can be used in any modern .NET project.
  • The API is thread-safe. Internal collections are built on ConcurrentDictionary, and event handlers either use locks or do not need them.
    The only thing worth remembering is to call ObjectRepository.Save() when the application terminates.
  • Arbitrary indices (they require uniqueness):

  repository.Set<ChildModel>().AddIndex(x => x.Value);
  repository.Set<ChildModel>().Find(x => x.Value, "myValue");

Who uses it?


Personally, I have started using this approach in all my hobby projects, because it is convenient and does not require much effort to write a data access layer or deploy heavy infrastructure. For me, storing the data in LiteDB or in a file is usually enough.


But in the past, when my team and I were building the now-defunct startup EscapeTeams (we thought it would bring money, but no, it brought experience again), we used Azure Table Storage to store the data.


Plans for the future


I want to fix one of the main drawbacks of this approach: the lack of horizontal scaling. This requires either distributed transactions (sic!), or a deliberate decision that the same data must not be changed from different instances, or letting it change on a "last write wins" basis.


From a technical point of view, I see the following scheme possible:


  • Store an EventLog and Snapshots instead of the object model
  • Find other instances (add the endpoints of all instances to the settings? UDP discovery? master/slave?)
  • Replicate the EventLog between instances using any consensus algorithm, such as Raft.

There is another problem that bothers me: cascading deletion, or detecting attempts to delete objects that are still referenced by other objects.


Source Code


If you have read this far, all that remains is to read the code; it can be found on GitHub.
