Software Design Blog

Simple solutions to solve complex problems

Push queue processing boundaries with 3x greater throughput

Broadcasting Messages

The problem is how can we measure the queue processing throughput? Or in a production environment, how can we add a performance counter without modifying our existing infrastructure?

The solution is to add a decorator pattern that will act as a proxy between the QueueMonitor class and the actual command processing class to intercept the communication and add instrumentation.

The previous post used the competing consumer pattern to process a queue concurrently. The message queue strategies post used a single processor to process a queue.

In this post, we will add instrumentation to the single queue processor and the competing consumer queue processor to compare performance.

Download Source Code

Setup

The experiment will add 512KB messages to a queue to simulate a decent workload. The intention is to read the messages from the queue and call a File SMTP mailer server that will write the emails to disk as fast as possible.

The core components such as the Queue Monitor were covered in the Message Queue Delivery Strategies post. The queue monitor is responsible for reading messages from the queue and pushing the message content to a command to execute.

The classes that will be used in this experiment are listed below.

  
    public class FileSmtpMailer : IMailer
    {
        private readonly string _path;

        public FileSmtpMailer(string path)
        {
            if (path == null) throw new ArgumentNullException("path");
            _path = path;
        }

        public void Send(string message, string sender, string recipients, string subject)
        {
            using (var client = new SmtpClient())
            {
                // This can be configured in the app.config
                client.DeliveryMethod = SmtpDeliveryMethod.SpecifiedPickupDirectory;
                client.PickupDirectoryLocation = _path;

                using (var mailMessage = new MailMessage(sender, recipients, 
                                                         subject, message))
                {
                    mailMessage.IsBodyHtml = true;
                    client.Send(mailMessage);
                }
            }
        }
    }

    public class MailProcessorCommand : ICommand<OrderMessage>
    {
        private readonly IMailer _mailer;

        public MailProcessorCommand(IMailer mailer)
        {
            if (mailer == null) throw new ArgumentNullException("mailer");
            _mailer = mailer;
        }

        public void Execute(OrderMessage message)
        {
            if (message == null) throw new ArgumentNullException("message");
            _mailer.Send(message.Body, message.Sender, 
                         message.Recipients, message.Subject);
        }
    }

Instrumentation

The TimerCommandDecorator decorator will be placed in between the QueueMonitor and the MailProcessorComamnd in order to measure the performance throughput.

  
    public class TimerCommandDecorator<T> : ICommand<T>
    {
        private readonly ICommand<T> _next;
        private int _messages;
        private Stopwatch _stopwatch;

        public TimerCommandDecorator(ICommand<T> next)
        {
            if (next == null) throw new ArgumentNullException("next");
            _next = next;
        }

        public void Execute(T message)
        {
            if (_stopwatch == null) _stopwatch = Stopwatch.StartNew();

            _next.Execute(message);

            _messages += 1;
            Console.WriteLine("Processing {0} messages took {1} sec", 
                              _messages, _stopwatch.Elapsed.TotalSeconds);
        }
    }

The TimerCommandDecorator implementation can easily be swapped out with a perfmon version that will increment the performance counter for each message.

Single vs Competing Consumers

Let's run the experiment by observing the throughput difference between the single processor and competing consumer receivers by processing 1000 messages.

The single processor version:

  
            using (var queue = new MessageQueue(QueuePath))
            {
                queue.Formatter = new BinaryMessageFormatter();

                var message = new OrderMessage()
                {
                    Recipients = "recipient@mail.com",
                    Sender = "admin@mail.com",
                    Subject = "Hello World",
                    Body = emailBody // loaded from a file
                };

                // Send messages to the queue
                for (var i = 0; i < 1000; i++)
                {
                    queue.Send(message);
                }               

                var fileMailer = new FileSmtpMailer(@"C:\temp");
                var command = new MailProcessorCommand(fileMailer);
                var timerCommand = new TimerCommandDecorator<OrderMessage>(command);
                var queueMonitor = new QueueMonitor<OrderMessage>(queue, timerCommand);
                queueMonitor.Start();
            }
Processing 1000 messages took 47.3004132 sec

The competing consumer processor version with 10 processors:

  
                // Create 10 processors
                var commands = new List<ICommand<OrderMessage>>();
                for (var i = 0; i < 10; i++)
                {
                    commands.Add(new MailProcessorCommand(fileMailer));
                }

                var command = new CompositeCompetingCommandProcessor<OrderMessage>(commands);
                var timerCommand = new TimerCommandDecorator<OrderMessage>(command);
                var queueMonitor = new QueueMonitor<OrderMessage>(queue, timerCommand);
                queueMonitor.Start();
Processing 1000 messages took 16.241723 sec

We could experiment using various numbers of concurrent receivers to determine how performance will be affected.

The competing consumer pattern is a great option to fully utilize resources but it may have a negative impact on other tenants of the system. For example, if we call an external API with too many requests, it may appear like a denial-of-service attack.

Summary

The advantage of the competing consumers pattern is the flexibility to configure the number of concurrent consumers to produce a balanced outcome. It is however difficult to determine the throughput and performance impact without instrumentation in place to measure resource utilization.

An approach was covered to add performance instrumentation. The performance metrics revealed that the queue processing throughput can be improved significantly using the competing consumer queue processor compared to the single queue processor.

Dispatching messages with the Competing Consumers Pattern

Broadcasting Messages

The previous post focused on fanning out the same queued message to multiple consumers.

The problem is that a single worker is great at processing a single message at a time but how do we increase the throughput by processing messages faster?

The solution is to use the Competing Consumers Pattern, which enables multiple concurrent workers to process messages received in the same queue. This approach will optimise throughput and balance the workload.

For example, travellers often queue at the airport check-in. Checking in would be a lengthy process if there was a long queue but only one check-in operator. The solution is to employee multiple operators to service the same queue to process it faster.

Download Source Code

Setup

The core components such as the Queue Monitor were covered in the Message Queue Delivery Strategies post. The queue monitor is responsible for reading messages from the queue and pushing the message content to a command to execute.

The OrderProcessorCommand class used in the previous post was modified on line 19 below to simulate each worker taking a different duration to process a request. For example, worker 1 will take 1 second to process a request whereas worker 2 will take 2 seconds to process a request.

  
    public interface ICommand<in T>
    {
        void Execute(T message);
    }

    public class OrderProcessorCommand : ICommand<OrderModel>
    {
        private readonly int _id;

        public OrderProcessorCommand(int id)
        {
            _id = id;
        }

        public void Execute(OrderModel message)
        {
            if (message == null) throw new ArgumentNullException("message");
            var start = DateTime.Now;
            Thread.Sleep(TimeSpan.FromSeconds(_id));
            Console.WriteLine("Processed {0} by worker {1} from {2} to {3}", 
                               message.Name, _id, start.ToString("h:mm:ss"),
                               DateTime.Now.ToString("h:mm:ss"));
        }
    }

Competing Consumers

The pool of available commands are orchestrated using the CompositeCompetingCommandProcessor class.

The class will automatically block the queue monitor on line 33 from retrieving and pushing in the next message when all of the command processors are busy. The queue monitor will be unblocked as soon as the first command processor becomes available.

The WorkingCommandModel is used to associate a command with a task.

  
    public class WorkingCommandModel<T>
    {
        public ICommand<T> Command { get; set; }
        public Task Task { get; set; }
    }

    public class CompositeCompetingCommandProcessor<T> : ICommand<T>
    {
        private readonly IEnumerable<WorkingCommandModel<T>> _workers;

        public CompositeCompetingCommandProcessor(IEnumerable<ICommand<T>> commands)
        {
            if (commands == null) throw new ArgumentNullException("commands");
            _workers = commands.Select(
                          c => new WorkingCommandModel<T> { Command = c }).ToList();
        }

        public void Execute(T message)
        {
            WorkingCommandModel<T> worker = null;

            while (worker == null)
            {
                worker = GetAvailableWorker();
                if (worker == null) WaitForWorkerAvailability();
            }

            worker.Task = Task.Factory.StartNew(() => 
                                         worker.Command.Execute(message), 
                                         TaskCreationOptions.LongRunning);

            // Block new messages from arriving until a worker is available
            if (GetAvailableWorker() == null) WaitForWorkerAvailability();
        }

        private void WaitForWorkerAvailability()
        {
            var tasks = (from w in _workers where w.Task != null select w.Task);
            Task.WaitAny(tasks.ToArray());
        }

        private WorkingCommandModel<T> GetAvailableWorker()
        {
            var worker = _workers.FirstOrDefault(w => 
                                       w.Task == null || w.Task.IsCompleted);
            if (worker != null && worker.Task != null) worker.Task.Dispose();
            return worker;
        }
    }
Let's run the solution using 3 workers to process 9 unique messages in the queue.
  
            using (var queue = new MessageQueue(QueuePath))
            {
                queue.Formatter = new BinaryMessageFormatter();

                // Write 9 messages to the queue
                for (var orderId = 1; orderId <= 9; orderId++)
                {
                    var order = new OrderModel()
                    {
                        Id = orderId,
                        Name = string.Format("Order {0}", orderId)
                    };
                    queue.Send(order);
                }
                
                // Create 3 workers
                var workers = new List<ICommand<OrderModel>>();
                for (var workerId = 1; workerId <= 3; workerId++)
                {
                    workers.Add(new OrderProcessorCommand(workerId));
                }

                // Process the queue
                var command = new CompositeCompetingCommandProcessor<OrderModel>(workers);
                var queueMonitor = new QueueMonitor<OrderModel>(queue, command);
                queueMonitor.Start();
            }
Processed Order 1 by worker 1 from 6:42:48 to 6:42:49
Processed Order 2 by worker 2 from 6:42:48 to 6:42:50
Processed Order 4 by worker 1 from 6:42:49 to 6:42:50
Processed Order 3 by worker 3 from 6:42:48 to 6:42:51
Processed Order 6 by worker 1 from 6:42:50 to 6:42:51
Processed Order 5 by worker 2 from 6:42:50 to 6:42:52
Processed Order 8 by worker 1 from 6:42:51 to 6:42:52
Processed Order 7 by worker 3 from 6:42:51 to 6:42:54
Processed Order 9 by worker 2 from 6:42:52 to 6:42:54

As we can see from the results above, worker 1 processed 4 messages whereas worker 3 only processed 2 messages.

The order can sometimes be important. For example, a reservation system may processes new bookings, updates and cancellations. Let's say the traveller creates a booking and decides to cancel it immediately. When both requests are in a queue processed by multiple workers then there is a potential that the cancellation will be processed before the create.

Summary

A practical example was provided to process messages from a single queue using multiple workers by applying the competing consumers pattern.

The drawback of potentially processing messages out of order was briefly covered. With enough interest, I may create a follow up post to address the order issue using various techniques such as a sequential message convoy, applying sessions and using correlation IDs to preserve delivery order.

The next post will provide an illustration for adding instrumentation to monitor queue processing throughput. The performance of the competing consumers solution is compared with the single worker solution.

Broadcasting messages using the composite pattern

Broadcasting Messages

The previous post focused on reading messages from a queue and pushing the messages to a single command for processing.

The problem is how can we notify multiple commands to process the same request? For example, when an order is received, we may want a command to 1) send an email, 2) add an audit record and 3) call an external API.

The solution is to use the composite pattern, whereby a group of commands are treated in a similar way as a single command to fan out the messages.

Download Source Code

Setup

The core components such as the Queue Monitor were covered in the Message Queue Delivery Strategies post.

Sequential Processing

The CompositeSequentialBroadcastCommand class below implements ICommand to indicate that it exhibits the behaviour of a command. The responsibility of the composite class is to loop though a collection of commands and call each command with the same request.

  
    public interface ICommand<in T>
    {
        void Execute(T message);
    }

    public class OrderProcessorCommand : ICommand<OrderModel>
    {
        private readonly int _id;

        public OrderProcessorCommand(int id)
        {
            _id = id;
        }

        public void Execute(OrderModel message)
        {
            if (message == null) throw new ArgumentNullException("message");
            var start = DateTime.Now;
            Thread.Sleep(TimeSpan.FromSeconds(2));
            Console.WriteLine("Processed {0} by worker {1} from {2} to {3}", 
                               message.Name, _id, start.ToString("h:mm:ss"), 
                               DateTime.Now.ToString("h:mm:ss"));
        }
    }

    public class CompositeSequentialBroadcastCommand<T> : ICommand<T>
    {
        private readonly IEnumerable<ICommand<T>> _commands;

        public CompositeSequentialBroadcastCommand(IEnumerable<ICommand<T>> commands)
        {
            if (commands == null) throw new ArgumentNullException("commands");
            _commands = commands.ToList();
        }

        public void Execute(T message)
        {
            foreach (var command in _commands)
            {
                command.Execute(message);
            }
        }
    }
Let's see what happens when 3 sequential workers are configured to process 2 messages.
  
            using (var queue = new MessageQueue(@".\Private$\Orders"))
            {
                queue.Formatter = new BinaryMessageFormatter();

                var workers = new List<ICommand<OrderModel>>();
                for (var workerId = 1; workerId <= 3; workerId++)
                {
                    workers.Add(new OrderProcessorCommand(workerId));
                }
                
                var command = new CompositeSequentialBroadcastCommand<OrderModel>(workers);
                var queueMonitor = new QueueMonitor<OrderModel>(queue, command);
                queueMonitor.Start();
            }
Processed Order 1 by worker 1 from 6:26:34 to 6:26:36
Processed Order 1 by worker 2 from 6:26:36 to 6:26:38
Processed Order 1 by worker 3 from 6:26:38 to 6:26:40
Processed Order 2 by worker 1 from 6:26:40 to 6:26:42
Processed Order 2 by worker 2 from 6:26:42 to 6:26:44
Processed Order 2 by worker 3 from 6:26:44 to 6:26:46

The output above shows that each worker processed the same message in sequence. If worker 2 failed then worker 3 would not be executed. This can be a great pattern when the sequence is important. For example, only send the welcome email if the order has been created successfully.

Parallel Processing

The CompositeParallelBroadcastCommand class is similar to the previous example. The only difference is that this class will call the commands in parallel.

  
    class CompositeParallelBroadcastCommand<T> : ICommand<T>
    {
        private readonly IEnumerable<ICommand<T>> _commands;

        public CompositeParallelBroadcastCommand(IEnumerable<ICommand<T>> commands)
        {
            if (commands == null) throw new ArgumentNullException("commands");
            _commands = commands.ToList();
        }

        public void Execute(T message)
        {
            Parallel.ForEach(_commands, c => c.Execute(message));
        }
    }
Let's see what happens when parallel 3 workers are configured to process 2 messages.
  
            using (var queue = new MessageQueue(@".\Private$\Orders"))
            {
                queue.Formatter = new BinaryMessageFormatter();

                var workers = new List<ICommand<OrderModel>>();
                for (var workerId = 1; workerId <= 3; workerId++)
                {
                    workers.Add(new OrderProcessorCommand(workerId));
                }
                
                var command = new CompositeParallelBroadcastCommand<OrderModel>(workers);
                var queueMonitor = new QueueMonitor<OrderModel>(queue, command);
                queueMonitor.Start();
            }
Processed Order 1 by worker 2 from 6:08:36 to 6:08:38
Processed Order 1 by worker 1 from 6:08:36 to 6:08:38
Processed Order 1 by worker 3 from 6:08:36 to 6:08:38
Processed Order 2 by worker 3 from 6:08:38 to 6:08:40
Processed Order 2 by worker 2 from 6:08:38 to 6:08:40
Processed Order 2 by worker 1 from 6:08:38 to 6:08:40

The output above shows that each worker processed the same message concurrently. This can be a great pattern when sequencing doesn’t matters. For example, send a welcome email and add an audit record at the same time.

Summary

This post focused on broadcasting a single message to multiple receivers. The composite pattern allows the behaviour of the Queue Monitor push to change without modifying the code.

Coming soon: The next post will focus on processing a single request by dispatching messaging to a pool of competing consumer processors.

Message Queue Delivery Strategies

Message Queue Delivery Strategies

The previous post focused on MSMQ fundamentals using a pull approach to retrieve messages from a queue.

This post is the start of a series that covers multiple strategies to push queued messages to clients. The intention of the push approach is to keep clients agnostic of being part of a message based architecture.

MSMQ technology is used but it is easy enough to change the implementation to use an alternative queuing solution such as Azure Service Bus.

Download Source Code

Setup

Setting up an MSQM was covered in the MSMQ fundamentals post.

The following code was used for placing 3 unique OrderModel messages in the queue.

  
    [Serializable]
    public class OrderModel
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    using (var queue = new MessageQueue(@".\Private$\Orders"))
    {
        queue.Formatter = new BinaryMessageFormatter();

        for (var orderId = 1; orderId <= 3; orderId++)
        {
            var order = new OrderModel()
            {
                Id = orderId,
                Name = string.Format("Order {0}", orderId)
            };
            queue.Send(order);
        }
    }

Command Pattern

The problem is how can a message infrastructure issue requests to objects without knowing anything about the operation being requested or the receiver of the request?

The solution is to use the command pattern to decouple senders from receivers. A command decouples the object that invokes the operation from the object that knows how to perform the operation.

The generic command interface is displayed below. The OrderProcessorCommand class implements the command interface to indicate that it can accept OrderModel messages, which it will use to simulate an order being processed.

  
    public interface ICommand<in T>
    {
        void Execute(T message);
    }

    public class OrderProcessorCommand : ICommand<OrderModel>
    {
        private readonly int _id;

        public OrderProcessorCommand(int id)
        {
            _id = id;
        }

        public void Execute(OrderModel message)
        {
            if (message == null) throw new ArgumentNullException("message");
            var start = DateTime.Now;

            // Simulate work being performed
            Thread.Sleep(TimeSpan.FromSeconds(2));

            Console.WriteLine("Processed {0} by worker {1} from {2} to {3}", 
                 message.Name, _id, start.ToString("h:mm:ss"), 
                 DateTime.Now.ToString("h:mm:ss"));
        }
    }
Note that a sleep was added on line 21 to simulate work being performed by the processor.

Queue Monitor

The queue monitor class acts as orchestrator, which is responsible for listening to the queue for incoming messages and calling a command to execute each message.

When the client calls the start method then the workflow outlined below will run consciously until the client calls the stop method:

  1. The BeginReceive method will kick off the queue listing operation.
  2. The OnReceiveComplete event will be raised when a message arrives.
  3. The command will be executed by passing in the message content.
  
    public interface IQueueMonitor : IDisposable
    {
        void Start();
        void Stop();
    }

    public class QueueMonitor<T> : IQueueMonitor
    {
        private readonly MessageQueue _queue;
        private readonly ICommand<T> _command;
        private bool _continue = true;

        public QueueMonitor(MessageQueue queue, ICommand<T> command)
        {
            if (queue == null) throw new ArgumentNullException("queue");
            if (command == null) throw new ArgumentNullException("command");
            _queue = queue;
            _command = command;

            _queue.ReceiveCompleted += OnReceiveCompleted;
        }

        private void OnReceiveCompleted(object sender, 
                             ReceiveCompletedEventArgs receiveCompletedEventArgs)
        {
            var message = _queue.EndReceive(receiveCompletedEventArgs.AsyncResult);
            _command.Execute((T)message.Body);
            if (_continue) _queue.BeginReceive();
        }

        public void Start()
        {
            _continue = true;
            _queue.BeginReceive();
        }

        public void Stop()
        {
            _continue = false;
        }

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool isDisposing)
        {
            if (!isDisposing || _queue == null) return;
            _queue.ReceiveCompleted -= OnReceiveCompleted;
            _queue.Dispose();
        }
    }

Single Receiver

Let's see the gears ticking over by processing the messages on the queue.

  
            using (var queue = new MessageQueue(@".\Private$\Orders"))
            {
                var command = new OrderProcessorCommand(1);
                var queueMonitor = new QueueMonitor<OrderModel>(queue, command);
                queueMonitor.Start();
            }
Processed Order 1 by worker 1 from 6:20:11 to 6:20:13
Processed Order 2 by worker 1 from 6:20:13 to 6:20:15
Processed Order 3 by worker 1 from 6:20:15 to 6:20:17 

The output above shows that the single receiver processed the messages in order, one at a time.

The drawback of the single receiver is the finite amount of throughput due to the constraint of processing one message at time.

Summary

This post demonstrated a generic approach to continually monitor a queue for new messages and pushing the message content to a command to execute.

The next post will describe how to broadcast a single message across multiple processors.

How to get started with Microsoft Message Queuing MSMQ

MSQM Fundamentals

The previous post described how to design a highly scalable solution using queue oriented architecture.

This post will cover the fundamentals to get started with Microsoft Message Queuing (MSMQ). Code examples are provided to illustrate how to create a queue, write messages to a queue and read messages from a queue synchronously and asynchronously.

Download Source Code

Setup

The pre-requisite is to install MSMQ, which comes free with Windows.

Creating a queue

The following code was used for creating the queue named orders.

  
using System.Messaging;

if (!MessageQueue.Exists(@".\Private$\Orders"))
{
    MessageQueue.Create(@".\Private$\Orders");
} 

The orders queue and messages on the queue can be viewed using Computer Management as shown below.

MSQM Computer Management

Writing Messages

The following code was used for writing the OrderModel DTO instance (message) to the queue using a BinaryMessageFormatter.

  
    [Serializable]
    public class OrderModel
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    using (var queue = new MessageQueue(@".\Private$\Orders"))
    {
        queue.Formatter = new BinaryMessageFormatter();

        var order = new OrderModel()
        {
            Id = 123,
            Name = "Demo Order"
        };
        queue.Send(order);
    }

Reading Messages

Blocking Synchronous Read

The following code was used for reading a message from the queue. The thread will be blocked on the receive method on line 5 until a message is available.
  
    using (var queue = new MessageQueue(@".\Private$\Orders"))
    {
        queue.Formatter = new BinaryMessageFormatter();

        var message = queue.Receive();
        var order = (OrderModel)message.Body;
    }

A read timeout duration can be specified. The code below will use a 1 sec timeout limit and catch the MessageQueueException raised due to the timeout. A custom ReadTimeoutException is thrown to notify the client that a timeout has occured.

  
    public class ReadTimeoutException : Exception
    {
        public ReadTimeoutException(string message, Exception innerException) 
               : base(message, innerException)
        {
            
        }
    }

    using (var queue = new MessageQueue(@".\Private$\Orders"))
    {
        queue.Formatter = new BinaryMessageFormatter();
        Message message = null;

        try
        {
            message = queue.Receive(TimeSpan.FromSeconds(1));
        }
        catch (MessageQueueException mqException)
        {
           if (mqException.MessageQueueErrorCode == MessageQueueErrorCode.IOTimeout)
           {
               throw new ReadTimeoutException("Reading from the queue timed out.", 
                            mqException);
           }
           throw;
        }
    }

Asynchronous Read

The following code was used to perform an async read in conjunction with a Task.
  
private async Task<OrderModel> ReadAsync()
{
    using (var queue = new MessageQueue(@".\Private$\Orders"))
    {
        queue.Formatter = new BinaryMessageFormatter();

        var message = await Task.Factory.FromAsync<Message>(
                              queue.BeginReceive(), queue.EndReceive);
        return (OrderModel)message.Body;
    }
}

Summary

This post covered the fundamentals of using a MSMQ to read and write messages.

The next post describes various message delivery strategies.

Design a highly scalable publish-subscribe solution

Design Scalable Solutions

Message oriented architecture is a great option to produce highly scalable, extendible and loosely coupled solutions. The message architecture describes the principal of using a software system that can send and receive (usually asynchronously) messages to one or more queues, so that services can interact without needing to know specific details about each other.

The problem is how can an application in an integrated architecture communicate with other applications that are interested in receiving messages without knowing the identities of the receivers?

The solution is to extend the communication infrastructure by creating topics or by dynamically inspecting message content. This allows applications to subscribe to specific messages.

This post will provide an illustration to turn a monolithic application into a distributed, highly scalable solution.

The Monolith

The conceptual model of a monolithic application is displayed below.

Monolithic Application

The application above consists of a collection of tightly coupled components. The side-effects are:

  • Cumbersome release management - the entire application must be upgraded in order to release a new feature or fix a bug in any of the components.
  • Low performance monitoring - isolating and measuring component throughput is difficult when all of the components are competing for the same resources.
  • Poor scalability - dedicated resources cannot be allocated to individual components.
  • High security risk - an attacker can gain access to all of the external dependencies once the application has been compromised.

Publish-Subscribe Solution

The conceptual model of a loosely coupled, Public-Subscribe (Pub-Sub) solution is displayed below.

Monolithic Application
The solution above consists of a queue based messaging system to decouple the components. The advantages are:
  • Resource Isolation - each component can be hosted on a dedicated or shared environment.
  • Decoupling - the online store's only responsibility is to write a message to a queue when an event occurs and therefore doesn't need to know who the subscribers are.
  • Extendible - subscribers can be added or removed.
  • Robust deployment - a single component can be upgraded without impacting others.
  • Secure - an attacker needs to compromise all of the isolated components to gain access to the entire ecosystem.
  • Testable - it is a lot easier to test component behaviour in isolation when the input and execution paths are limited.

Recovering from failure

The image below depicts the mail server being unavailable temporarily.

Queued Messages during temporary outages

Recovering from temporary outages is easy since messages will continue to queue up.

Messages will not be lost during an outage. Dequeuing will occur once the service becomes available again.

Monitoring the queue length provides insight into the overall health of the system. The diagnosis of a high queue length is typically:

  • An indication that the subscriber service is down.
  • The subscriber is over-utilized and does not have the capacity to service the incoming messages fast enough.

Peak Load can be handled gracefully without stressing out resources since the email notification component will continually dequeue and process messages, one after the other, regardless of the number of messages in the queue. The email notification component will eventually catch up during periods of lower than normal use.

Load Distribution and Redundancy

The image below depicts distributing the mail server role across multiple workers.

Load Distribution and Redundancy

Workload distribution can be achieved by deploying additional subscribers to dequeue and process messages from the same queue.

High availability, scalability and redundancy capabilities are provided when multiple subscribers are deployed to service the same queue. The system will remain online during maintenance by upgrading one subscriber at a time.

Performance benchmarking can be achieved by monitoring the queue length. Throughput can be predicated using the follow approach:

  1. Stopping the component server
  2. Filling the queue up with a large quantity of messages
  3. Starting the component server
  4. Measuring the number of messages that are processed per minute by monitoring the queue length

Lower latency can be achieved with geo-distributed deployments. For example, the Loyalty system, which is responsible for communicating with an external API, can be hosted in the same region as the API.

Summary

This post provided a laundry list of benefits for designing applications using a message based architectural pattern compared to traditional highly coupled monolithic applications. The Pub-Sub architecture is a proven and reliable approach to produce highly scalable solutions.

How to make your application 5x faster with background processing

5X Performance Gain

Processing everything in real-time is easy but often makes an application tightly coupled, prone to failure and difficult to scale.

Let’s assume a user just clicked on the purchase button to kick off the following workflow:

  1. Save the order
  2. Send a confirmation email (mask the credit card number)
  3. Update the user’s loyalty points in an external system

The problem is that non-critical workloads (step 2 and 3) can negatively impact an application's performance and recoverability. What happens when the mail server is down and the email confirmation fails? How do we monitor failure? Do we roll the transaction back or replay the workflow by taking a checkpoint after each successful step?

The solution is to ensure that application flow isn't impeded by waiting on non-critical workloads to complete. Queue based systems are effective at solving this problem.

This post will use a purchase order application that accepts XML requests. The application will sanitise the credit card details by masking most of the digits before sending the request to a mailbox.

Download Source Code

Setup

The following XML document will be used as the purchase order. It is about 512KB to simulate a decent payload.

  


  
    Test User
    
123 Abc Road, Sydney, Australia
Visa 4111111111111111
Gambardella, Matthew XML Developer's Guide ...
The PurchaseOrderProcessor class was intentionally kept small to focus on solving the main problem, which is to minimise the performance impact of the mailing system.
  
    public interface IMailer
    {
        void Send(string message);
    }

    public class PurchaseOrderProcessor
    {
        private readonly IMailer _mailer;

        public PurchaseOrderProcessor(IMailer mailer)
        {
            if (mailer == null) throw new ArgumentNullException("mailer");
            _mailer = mailer;
        }

        public void Process(string xmlRequest)
        {
            var doc = new XmlDocument();
            doc.LoadXml(xmlRequest);

            // Process the order

            _mailer.Send(xmlRequest);
        }
    }
The SMTP mailer will be configured to write the mail to disk to simplify the illustration.
  
    public class FileSmtpMailer : IMailer
    {
        private readonly string _path;
        private readonly string _from;
        private readonly string _recipients;
        private readonly string _subject;

        public FileSmtpMailer(string path, string from, string recipients, string subject)
        {
            if (path == null) throw new ArgumentNullException("path");
            _path = path;
            _from = @from;
            _recipients = recipients;
            _subject = subject;
        }

        public void Send(string message)
        {
            using (var client = new SmtpClient())
            {
                // This can be configured in the app.config
                client.DeliveryMethod = SmtpDeliveryMethod.SpecifiedPickupDirectory;
                client.PickupDirectoryLocation = _path;

                using (var mailMessage = 
                          new MailMessage(_from, _recipients, _subject, message))
                {
                    client.IsBodyHtml = true;
                    client.Send(mailMessage);    
                }             
            }
        }
    }
The MaskedMailerDecorator will be used for masking the credit card details.
  
    public class MaskedMailerDecorator : IMailer
    {
        private readonly Regex _validationRegex;
        private readonly IMailer _next;
        private const char MaskCharacter = '*';
        private const int MaskDigits = 4;

        public MaskedMailerDecorator(Regex regex, IMailer next)
        {
            if (regex == null) throw new ArgumentNullException("regex");
            if (next == null) throw new ArgumentNullException("next");
            _validationRegex = regex;
            _next = next;
        }

        public void Send(string message)
        {
            if (_validationRegex.IsMatch(message))
            {
                message = _validationRegex.Replace(message, 
                               match => MaskNumber(match.Value));
            }
            _next.Send(message);
        }

        private static string MaskNumber(string value)
        {
            return value.Length <= MaskDigits ?
               new string(MaskCharacter, value.Length) :
               string.Format("{0}{1}", 
                          new string(MaskCharacter, value.Length - MaskDigits),
                          value.Substring(value.Length - MaskDigits, MaskDigits));
        }
    }

Baseline

Let's establish a performance baseline by running the application with the Null Mailer that doesn't do anything. Refer to the reject the null checked object post if you are new to the null object pattern.

  
    public class NullMailer : IMailer
    {
        public void Send(string message)
        {
            // intentionally blank
        }
    }

    static void Main(string[] args)
    {
        var path = Path.Combine(Directory.GetCurrentDirectory(), "PurchaseOrder.xml");
        var request = File.ReadAllText(path); 

        var nullMailer = new NullMailer();
        var orderProcessor = new PurchaseOrderProcessor(nullMailer);

        var stopWatch = Stopwatch.StartNew();
        Parallel.For(0, 1000, i => orderProcessor.Process(request));
        stopWatch.Stop();

        Console.WriteLine("Seconds: {0}", stopWatch.Elapsed.TotalSeconds);
        Console.ReadLine();
    }
Seconds: 6.3404086

Real-Time Processing

Let’s measure the performance when MaskedMailerDecorator and FileSmtpMailer are used.
  
            Directory.CreateDirectory(@"C:\temp");
            var ccRegEx = new Regex(@"(?:\b4[0-9]{12}(?:[0-9]{3})?\b
                                       |\b5[1-5][0-9]{14}\b)", RegexOptions.Compiled);

            var path = Path.Combine(Directory.GetCurrentDirectory(), "PurchaseOrder.xml");
            var request = File.ReadAllText(path);
            
            // Use Unity for doing the wiring          
            var fileMailer = new FileSmtpMailer(@"C:\temp", "p@mail.com", "a@mail.com", "Order");
            var maskedMailer = new MaskedMailerDecorator(ccRegEx, fileMailer);
            var orderProcessor = new PurchaseOrderProcessor(maskedMailer);

            var stopWatch = Stopwatch.StartNew();
            Parallel.For(0, 1000, i => orderProcessor.Process(request));
            stopWatch.Stop();

            Console.WriteLine("Seconds: {0}", stopWatch.Elapsed.TotalSeconds);
            Console.ReadLine();
Seconds: 32.0430142

Background Processing

Let's extend the solution by adding a memory queue to buffer the results. The queue effectively acts as an outbox for sending mail without overwhelming the mail server with parallel requests.

    public class QueuedMailerDecorator : IMailer
    {
        private readonly IMailer _next;
        private BlockingCollection<string> _messages;

        public QueuedMailerDecorator(IMailer next)
        {
            if (next == null) throw new ArgumentNullException("next");
            _next = next;
            _messages = new BlockingCollection<string>();

            Task.Factory.StartNew(() =>
            {
                try
                {
                    // Block the thread until a message becomes available
                    foreach (var message in _messages.GetConsumingEnumerable())
                    {
                        _next.Send(message);
                    }
                }
                finally
                {
                    _messages.Dispose();
                }
            }, TaskCreationOptions.LongRunning);
        }

        public void Send(string message)
        {
            if (_messages == null || _messages.IsAddingCompleted)
            {
                return;
            }
            try
            {
                _messages.TryAdd(message);
            }
            catch (ObjectDisposedException)
            {
                Trace.WriteLine("Add failed since the queue was disposed.");
            }
        }

        public void Dispose()
        {
            if (_messages != null && !_messages.IsAddingCompleted)
            {
                _messages.CompleteAdding();
            }
            GC.SuppressFinalize(this);
        }
    }

Here is a diagram that depicts the collection of decorated mailers to intercept the communication between the PurchaseOrderProcessor and the FileSmtpMailer.

SMTP Mailer Decorators

Let's run the code below to evaluate if the queue made a performance difference.
             
            // Use Unity for doing the wiring          
            var fileMailer = new FileSmtpMailer(@"C:\temp", "p@mail.com", "a@mail.com", "Order");
            var maskedMailer = new MaskedMailerDecorator(creditCardRegEx, fileMailer);
            var queuedMailer = new QueuedMailerDecorator(maskedMailer);
            var orderProcessor = new PurchaseOrderProcessor(queuedMailer);

            var stopWatch = Stopwatch.StartNew();
            Parallel.For(0, 1000, i => orderProcessor.Process(request));
            stopWatch.Stop();

            Console.WriteLine("Seconds: {0}", stopWatch.Elapsed.TotalSeconds);
            Console.ReadLine();
Seconds: 6.3908034 

The drawback of the in-memory queue used in this post is that it requires memory to store messages temporarily before it is processed. It is possible to lose messages if the application crashes or stops unexpectedly before all of the requests have been processed.

These issues can be addressed with a locally persisted queue such as MSMQ or a cloud based queue such as Azure Service Bus. Queues provide many benefits that will be covered in the next post.

Summary

This post provided evidence to the performance degradation caused by processing non-critical workloads in real-time. Using queues can be an effective technique to keep an application responsive by buffering and processing tasks in the background.

Null Check Performance Improvement and Failure Reduction

Calling optional dependencies such as logging, tracing and notifications should be fast and reliable.

The Null Object pattern can be used for reducing code complexity by managing optional dependencies with default behaviour as discussed in the Reject the Null Checked Object post.

This post aims to illustrate the problem with the Null Object pattern and how to resolve it using a simple lambda expression.

The problem is that the null check pattern can potentially lead to performance degradation and unnecessary failure.

The solution is to avoid executing unnecessary operations especially for default behaviour.

Download Source Code

Setup

The intent of the sample application is to read and parse XML documents for a book store. The document reader is responsible for logging the book titles using the code below.

    public interface ILogger
    {
        void Log(string message);
    }

    public class ConsoleLogger : ILogger
    {
        public void Log(string message)
        {
            Console.WriteLine(message);
        }
    }

    public class DocumentReader
    {
        private readonly ILogger _logger;

        public DocumentReader(ILogger logger)
        {
            if (logger == null) throw new ArgumentNullException("logger");
            _logger = logger;
        }

        public void Read(XmlDocument document)
        {
            if (document == null) throw new ArgumentNullException("document");
            var books = document.SelectNodes("catalog/book");
            if (books == null) throw new XmlException("Catalog/book missing.");

            _logger.Log(string.Format("Titles: {0}.", 
                        string.Join(", ", GetBookTitles(document))));
            
            // Implementation
        }

        private static IEnumerable<string> GetBookTitles(XmlNode document)
        {
            Console.WriteLine("Retrieving the book titles");
            var titlesNodes = document.SelectNodes("catalog/book/title");
            if (titlesNodes == null) yield break;
            foreach (XmlElement title in titlesNodes)
            {
                yield return title.InnerText;
            }
        }
    }

Problem

Here is an example that illustrates the execution of the application.

 

        static void Main(string[] args)
        {
            var document = new XmlDocument();
            document.LoadXml(@"<catalog>
                                 <book><title>Developer's Guide</title></book>
                                 <book><title>Tester's Guide</title></book>
                               </catalog>");

            var logger = new ConsoleLogger();
            var docReader = new DocumentReader(logger);

            docReader.Read(document);
            Console.ReadLine();
        }
Retrieving the book titles
Book titles: Developer's Guide, Tester's Guide.
The solution works well when an actual logger is used but what happens if we replace the logger with the NullLogger as shown below?
 
    public class NullLogger : ILogger
    {
        public void Log(string message)
        {
            // Purposefully provides no behaviour
        }
    }

    var logger = new NullLogger();
    var docReader = new DocumentReader(logger);

    docReader.Read(document);
    Console.ReadLine();
Retrieving the book titles

Solution

Here is an example that illustrates the improved version. The logger was modified to accept two methods. The first method takes a string for simple logging operations and the second method takes a lambda function that will produce a string.

 
    public interface ILogger
    {
        void Log(string message);
        void Log(Func<string> messageFunc);
    }

    public class NullLogger : ILogger
    {
        public void Log(string message)
        {
            // Purposefully provides no behaviour
        }

        public void Log(Func<string> messageFunc)
        {
            // Purposefully provides no behaviour
        }
    }

    public class ConsoleLogger : ILogger
    {
        public void Log(string message)
        {
            Console.WriteLine(message);
        }

        public void Log(Func<string> messageFunc)
        {
            try
            {
                Console.WriteLine(messageFunc.Invoke());
            }
            catch (Exception ex)
            {
                Console.WriteLine("Failed to log the result. Error: {0}", ex);
            } 
        }
    }

    public class DocumentReader
    {
        public void Read(XmlDocument document)
        {
            _logger.Log(() => string.Format("Titles: {0}.", 
                              string.Join(", ", GetBookTitles(document))));
           ...
         }
    }

Running the example using the Console Logger will produce the same result as the original example.

Let's run the Null Logger example again.

 
    var logger = new NullLogger();
    var docReader = new DocumentReader(logger);

    docReader.Read(document);
    Console.ReadLine();

Summary

The null object checking pattern simplifies solutions by removing the need to check for null. The drawback is the potential risk of impeding performance and causing failure. Passing around lamba expression functions can be a subtle solution to overcome the problem without overcomplicating the code.

Reject the Null Checked Object

Software solutions become overcomplicated with dozens of long and ugly checks for null. It is like a pothole and you will eventually fall in when you’re not looking carefully.

The problem is that methods returns null instead of real objects and every client must check for null to avoid the application from blowing up due to a NullReferenceExcetion (Object reference not set to an instance of an object).

The solution is to return a null object that exhibits default behaviour. This is called the Null Object design pattern.

Download Source Code

Example 1: Debug Mode

Here is an example where additional tasks are performed such as logging verbose messages when the application is in debug mode.

The cyclomatic complexity of the application has increased due to the number checks as shown below.

Null Checking

            if (isDebugMode)
            {
                logger.Log("Saving - Step 1");
            }

            // Save Step 1

            if (isDebugMode)
            {
                logger.Log("Saving - Step 2");
            }

            // Save Step 2
The code can be simplified as shown below since the logger is always called. A real logger can be registered whilst in debug mode otherwise a null logger that logs to nowhere will be used by default.

Null Object Design Pattern

            
public class NullLogger : ILogger
{
    public void Log(string message)
    {
        // Purposefully provides no behaviour
    }
}

logger.Log("Saving - Step 1");
// Save Step 1
logger.Log("Saving - Step 2");
// Save Step 2

Example 2: Registration System

The use case is to create a service to confirm reservations and to send optional confirmation notices. The service is also responsible for retrieving reservations that can be filtered based on the confirmation status.

The following interfaces will be used in the illustration.

    public interface INotification<in T>
    {
        void Notifiy(T message);
    }

    public interface IReservationRepository
    {
        void Save(Confirmation request);
        IEnumerable<Reservation> GetReservations();
    }

Null Checking

Here is the complicated example that performs null checks.

    public class ReservationServiceV1 : IReservationService
    {
        private readonly IReservationRepository _repository;
        private readonly INotification<Confirmation> _notifier;

        public ReservationServiceV1(IReservationRepository repository, 
                                    INotification<Confirmation> notifier)
        {
            if (repository == null) throw new ArgumentNullException("repository");
            _repository = repository;
            _notifier = notifier;
        }

        public void Confirm(Confirmation request)
        {
            _repository.Save(request);
            if (_notifier != null) _notifier.Notifiy(request);
        }

        public IEnumerable<Reservation> GetReservations(bool confirmed)
        {
            var reservations = _repository.GetReservations();
            return reservations == null ? null : 
                   reservations.Where(reservation => reservation.Confirmed == confirmed);
        }
    }

Null Object Design Pattern

Here is the simplified version that works with default behaviour.

 
public class NullNotification<T> : INotification<T>
{
    public void Notifiy(T message)
    {
        // Purposefully provides no behaviour
    }
}
   
public class ReservationServiceV2 : IReservationService
{
    private readonly IReservationRepository _repository;
    private readonly INotification<Confirmation> _notifier;

    public ReservationServiceV2(IReservationRepository repository, 
                                INotification<Confirmation> notifier)
    {
        if (repository == null) throw new ArgumentNullException("repository");
        if (notifier == null) throw new ArgumentNullException("notifier");
        _repository = repository;
        _notifier = notifier;
    }

    public void Confirm(Confirmation request)
    {
        _repository.Save(request);
        _notifier.Notifiy(request);
    }

    public IEnumerable<Reservation> GetReservations(bool confirmed)
    {
        return _repository.GetReservations()
                          .Where(reservation => reservation.Confirmed == confirmed);
    }
}

Summary

Avoid returning null and return default behaviour instead; such as an empty list. The Null Object design pattern will simplify code and reduce potential slip-ups causing unexpected failure.

10x Performance Gain: IEnumerable vs IQueryable

This post compares IEnumerable against IQuerable using an experiment to illustrate the behaviour and performance differences. Spotting a func vs an expression func filter bug is easy to miss. The caller’s syntax stays the same but it could have a 10x performance impact on your application.

Download Source Code

Setup

SQL Server 2014 was used for hosting the database. The GeoAllCountries table content was sourced from GeoNames and contains just over 10 million rows. Entity Framework 6 was used for the LINQ to SQL integration.

Predicate Function

The code below will query the GeoAllCountries table and use a filter predicate function to filter the results starting with "Aus".
        static void Main(string[] args)
        {
            var stopWatch = Stopwatch.StartNew();
            var countryNames = GetCountryNames(name => name.StartsWith("Aus"));

            foreach (var name in countryNames)
            {
                Console.WriteLine(name);
            }

            stopWatch.Stop();
            Console.WriteLine("Running time: {0}", stopWatch.Elapsed.TotalSeconds);
            Console.ReadLine();
        }

        public static IEnumerable<string> GetCountryNames(Func<string, bool> filterFunc)
        {
            using (var context = new TestDatabaseDataContext())
            {
                IQueryable<string> names = (from country in context.GeoAllCountries 
                                            select country.Name);
         
                foreach (var name in names.Where(filterFunc))
                {
                    yield return name;
                }
            }  
        }
Running time: 8.6558463
SQL Server Profiler captured the following query between the application and the database:
SELECT [t0].[Name] FROM [dbo].[GeoAllCountries] AS [t0]

Expression Predicate Function

The code below will query the GeoAllCountries table and use an expression filter predicate function to filter the results starting with "Aus".
        static void Main(string[] args)
        {
            var stopWatch = Stopwatch.StartNew();
            var countryNames = GetCountryNames(name => name.StartsWith("Aus"));

            foreach (var name in countryNames)
            {
                Console.WriteLine(name);
            }

            stopWatch.Stop();
            Console.WriteLine("Running time: {0}", stopWatch.Elapsed.TotalSeconds);
            Console.ReadLine();
        }

        public static IEnumerable<string> GetCountryNames(
                      Expression<Func<string, bool>> filterFunc)
        {
            using (var context = new TestDatabaseDataContext())
            {
                IQueryable<string> names = (from country in context.GeoAllCountries 
                                            select country.Name);
         
                foreach (var name in names.Where(filterFunc))
                {
                    yield return name;
                }
            }  
        }
Running time: 0.8633603
SQL Server Profiler captured the following query between the application and the database:
exec sp_executesql N'SELECT [t0].[Name]
FROM [dbo].[GeoAllCountries] AS [t0]
WHERE [t0].[Name] LIKE @p0',N'@p0 nvarchar(4000)',@p0=N'Aus%'
Note that the client code did not change. Adding the expression syntax around the func made a world of difference. It is pretty easy to add the predicate syntax but is just as easy to miss in a code review unless you have the fidelity to spot the issue and understand the implications.

Summary

IEnumerable executes the select query at the database and filters the data in-memory at the application layer.

IQueryable executes the select query and all of the filters at the database.

The database filtering reduced network traffic and application memory load resulting in a significant 10x performance gain.

Butcher the LINQ to SQL Resource Hog

Has your LINQ to SQL repository ever thrown a "cannot access a disposed object" exception? You can fix it by calling ToList on the LINQ query but it will impede your application’s performance and scalability.

This post covers common pitfalls and how to avoid them when dealing with unmanaged resources such as the lifecycle of a database connection in a pull-based IEnumerable repository. An investigation is made to uncover when Entity Framework and LINQ to SQL resources are disposed of and how to implement an effective solution.

Download Source Code

Setup

The following repository class will be used to model the same behaviour as an actual LINQ to SQL database repository.
    public class Model
    {
        public string Message { get; set; }
    }

    public class Repository : IDisposable
    {
        public IEnumerable<Model> Records
        {
            get
            {
                if (_disposed) throw new InvalidOperationException("Disposed");
                Console.WriteLine("Building message one");
                yield return new Model() { Message = "Message one" };
                if (_disposed) throw new InvalidOperationException("Disposed");
                Console.WriteLine("Building message two");
                yield return new Model() { Message = "Message two" };
            }
        }

        private bool _disposed = false;

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }
        
        protected virtual void Dispose(bool disposing)
        {
            if (_disposed) return;
            _disposed = true;
        }
    }

LINQ to SQL: Cannot access a disposed object

Let's execute the LINQ query below to call the repository and write the results to the console.
        static void Main(string[] args)
        {
            var records = GetLinqRecords();
            foreach (var record in records)
            {
                Console.WriteLine(record);    
            }
            Console.ReadLine();
        }

        private static IEnumerable<string> GetLinqRecords()
        {
            using (var repository = new Repository())
            {
                return (from model in repository.Records select model.Message);
            }
        }

A LINQ to SQL application would raise the following exception:

An unhandled exception of type 'System.ObjectDisposedException' occurred in System.Data.Linq.dll
Additional information: Cannot access a disposed object.

LINQ to SQL: ToList

Let's execute the LINQ query below by materialising the records to a list first:
        
        static void Main(string[] args)
        {
            var records = GetLinqRecordsToList();
            foreach (var record in records)
            {
                Console.WriteLine(record);    
            }
            Console.ReadLine();
        }

        private static IEnumerable>string< GetLinqRecordsToList()
        {
            using (var repository = new Repository())
            {
                return (from model in repository.Records select model.Message).ToList();
            }
        }
Building message one
Building message two
Message one
Message two

Yield to the rescue

Let's execute the code below using yield instead:
        static void Main(string[] args)
        {
            var records = GetYieldRecords();
            foreach (var record in records)
            {
                Console.WriteLine(record);    
            }
            Console.ReadLine();
        }

        private static IEnumerable<string> GetYieldRecords()
        {
            using (var repository = new Repository())
            {
                foreach (var record in repository.Records)
                {
                    yield return record.Message;
                }
            }
        }
Building message one
Message one
Building message two
Message two

Don’t refactor your code

Let's see what happens when we run a refactored version of the code:
        static void Main(string[] args)
        {
            var records = GetRefactoredYieldRecords();
            foreach (var record in records)
            {
                Console.WriteLine(record);    
            }
            Console.ReadLine();
        }

        private static IEnumerable<string>string<string> GetRefactoredYieldRecords()
        {
            using (var repository = new Repository())
            {
                return YieldRecords(repository.Records);
            }
        }

        private static IEnumerable<string> YieldRecords(IEnumerable<Model> records)
        {
            if (records == null) throw new ArgumentNullException("records");
            foreach (var record in records)
            {
                yield return record.Message;
            }
        }

Déjà Vu. The same error occurred as seen in the LINQ to SQL example. Take a closer look at the IL produced by the compiler using a tool such as ILSpy.

In the refactored and the LINQ to SQL version, instead of returning an IEnumerable function directly, a function is returned that points to another IEnumerable function. Effectively, it is an IEnumerable within an IEnumerable. The connection lifecycle is managed in the first IEnumerable function which will be disposed once the second IEnumerable function is returned to the caller.

Keep it simple, return the IEnumerable function directly to the caller.

Yield IEnumerable vs List Building

This post describes the use of yield and compares it to building and returning a list behind an IEnumerable<T> interface.

Download Source Code

Setup

The example consists of a contact store that will allow the client to retrieve a collection of contacts.

The IStore.GetEnumerator method must return IEnumerable<T>, which is a strongly typed generic interface that describes the ability to fetch the next item in the collection.

The actual implementation of the collection can be decided by the concrete implementation. For example, the collection could consist of an array, generic list or yielded items.

       
    public interface IStore<out T>
    {
        IEnumerable<T> GetEnumerator();
    }

    public class ContactModel
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

Calling GetEnumerator

Let's create two different stores, call the GetEnumerator on each store and evaluate the console logs to determine if there is a difference between the List Store and the Yield Store.

List Store

The code below is a common pattern I've observed during code reviews, where a list is instantiated, populated and returned once ALL of the records have been constructed.
    
    public class ContactListStore : IStore<ContactModel>
    {
        public IEnumerable<ContactModel> GetEnumerator()
        {
            var contacts = new List<ContactModel>();
            Console.WriteLine("ContactListStore: Creating contact 1");
            contacts.Add(new ContactModel() { FirstName = "Bob", LastName = "Blue" });
            Console.WriteLine("ContactListStore: Creating contact 2");
            contacts.Add(new ContactModel() { FirstName = "Jim", LastName = "Green" });
            Console.WriteLine("ContactListStore: Creating contact 3");
            contacts.Add(new ContactModel() { FirstName = "Susan", LastName = "Orange" });
            return contacts;
        }
    }

    static void Main(string[] args)
    {
        var store = new ContactListStore();
        var contacts = store.GetEnumerator();

        Console.WriteLine("Ready to iterate through the collection.");
        Console.ReadLine();
    }
ContactListStore: Creating contact 1
ContactListStore: Creating contact 2
ContactListStore: Creating contact 3
Ready to iterate through the collection.

Yield Store

The yield alternative is shown below, where each instance is returned as soon as it is produced.
    
    public class ContactYieldStore : IStore<ContactModel>
    {
        public IEnumerable<ContactModel> GetEnumerator()
        {
            Console.WriteLine("ContactYieldStore: Creating contact 1");
            yield return new ContactModel() { FirstName = "Bob", LastName = "Blue" };
            Console.WriteLine("ContactYieldStore: Creating contact 2");
            yield return new ContactModel() { FirstName = "Jim", LastName = "Green" };
            Console.WriteLine("ContactYieldStore: Creating contact 3");
            yield return new ContactModel() { FirstName = "Susan", LastName = "Orange" };
        }
    }

    static void Main(string[] args)
    {
        var store = new ContactYieldStore();
        var contacts = store.GetEnumerator();

        Console.WriteLine("Ready to iterate through the collection.");
        Console.ReadLine();
    }
Ready to iterate through the collection.
Let's call the collection again and obverse the behaviour when we fetch the first contact in the collection.
  
        static void Main(string[] args)
        {
            var store = new ContactYieldStore();
            var contacts = store.GetEnumerator();
            Console.WriteLine("Ready to iterate through the collection");
            Console.WriteLine("Hello {0}", contacts.First().FirstName);
            Console.ReadLine();
        }
Ready to iterate through the collection
ContactYieldStore: Creating contact 1
Hello Bob

Possible multiple enumeration of IEnumerable

Have you ever noticed the "possible multiple enumeration of IEnumerable" warning from ReSharper? ReSharper is warning us about a potential double handling issue, particularly for deferred execution functions such as yield and Linq. Have a look at the results produced from the code below.
  
        static void Main(string[] args)
        {
            var store = new ContactYieldStore();
            var contacts = store.GetEnumerator();
            Console.WriteLine("Ready to iterate through the collection");

            if (contacts.Any())
            {
                foreach (var contact in contacts)
                {
                    Console.WriteLine("Hello {0}", contact.FirstName);
                }
            }
            
            Console.ReadLine();
        }
Ready to iterate through the collection
ContactYieldStore: Creating contact 1
ContactYieldStore: Creating contact 1
Hello Bob
ContactYieldStore: Creating contact 2
Hello Jim
ContactYieldStore: Creating contact 3
Hello Susan

IEnumerable.ToList()

What if we have a requirement to materialize (build) the entire collection immediately? The answer is shown below.

  
        static void Main(string[] args)
        {
            var store = new ContactYieldStore();
            var contacts = store.GetEnumerator().ToList();
            Console.WriteLine("Ready to iterate through the collection");
            Console.ReadLine();
        }
ContactYieldStore: Creating contact 1
ContactYieldStore: Creating contact 2
ContactYieldStore: Creating contact 3
Ready to iterate through the collection

Calling .ToList() on IEnumerable will build the entire collection up front.

Comparison

The list implementation loaded all of the contacts immediately whereas the yield implementation provided a deferred execution solution.

In the list example, the caller doesn't have the option to defer execution. The yield approach provides greater flexibility since the caller can decide to pre-load the data or pull each record as required. A common trap to avoid is performing multiple enumerations on the same collection since yield and Linq functions will perform the same operation for each enumeration.

In practice, it is often desirable to perform the minimum amount of work needed in order to reduce the resource consumption of an application.

For example, we may have an application that processes millions of records from a database. The following benefits can be achieved when we use IEnumerable in a deferred execution pull-based model:

  • Scalability, reliability and predictability are likely to improve since the number of records does not significantly affect the application’s resource requirements.
  • Performance and responsiveness are likely to improve since processing can start immediately instead of waiting for the entire collection to be loaded first.
  • Recoverability and utilisation are likely to improve since the application can be stopped, started, interrupted or fail. Only the items in progress will be lost compared to pre-fetching all of the data where only a portion of the results was actually used.
  • Continuous processing is possible in environments where constant workload streams are added.

Evolution of the Singleton Design Pattern in C#

This post is focused on the evolutionary process of implementing the singleton design pattern that restricts the instantiation to one object.

We will start with a simple implementation and move onto a thread safe example using locking. A comparison is made using Lazy Initialization and we will complete the post with a loosely coupled singleton solution using Dependency Injection.

Download Source Code

Singleton Components

We will use logging as our singleton use-case. The logging interface and implementation shown below will write to the console when the ConsoleLogger is instantiated.

       
    public interface ILogger
    {
        void Log(string message);
    }

    public class ConsoleLogger : ILogger
    {
        public ConsoleLogger()
        {
            Console.WriteLine("{0} constructed", GetType().Name);
        }

        public void Log(string message)
        {
            Console.WriteLine(message);
        }
    }

Single Thread Singleton

Below is a simple implementation of a singleton.

    
    public class SingleThreadLogger
    {
         private static ILogger _instance;

         public static ILogger Instance
         {
             get
             {
                 if (_instance == null)
                 { 
                     _instance = new ConsoleLogger();
                 }
                 return _instance;
             }
         }
    }
Calling the singleton in a single thread from the console:
    
    static void Main(string[] args)
    {
         SingleThreadLogger.Instance.Log("Hello World");
         SingleThreadLogger.Instance.Log("Hello World");     
         Console.ReadLine();
    }
  ConsoleLogger constructed
  Hello World
  Hello World
Calling the singleton using multiple threads from the console:
    
    static void Main(string[] args)
    {
         Parallel.Invoke(() => SingleThreadLogger.Instance.Log("Hello World"),
                         () => SingleThreadLogger.Instance.Log("Hello World"));
         Console.ReadLine();
    }
  ConsoleLogger constructed
  Hello World
  ConsoleLogger constructed
  Hello World

Thread Safe Singleton

Below is an example of a poorly implemented locking approach around the constructor of the ConsoleLogger:
    
    public class ThreadLockConstructorLogger
    {
        private static ILogger _instance;

        private static readonly Object _lock = new object();

        public static ILogger Instance
        {
            get
            {
                if (_instance == null)
                {
                     lock (_lock)
                    {
                         // WARNING: Multiple instantiation
                        _instance = new ConsoleLogger();
                    }       
                }
                return _instance;
            }
        }
    }
Calling the singleton using multiple threads from the console:
    
    static void Main(string[] args)
    {
         Parallel.Invoke(() => ThreadLockConstructorLogger.Instance.Log("Hello World"),
                         () => ThreadLockConstructorLogger.Instance.Log("Hello World")); 
         Console.ReadLine();
    }
  ConsoleLogger constructed
  Hello World
  ConsoleLogger constructed
  Hello World

The ThreadLockConstructorLogger class has the classic double-checked locking bug. The first thread acquired the lock and is busy on line 15 while the second thread is waiting for the lock to be released on line 13. Once the second thread acquires the lock, another instance will be created on line 15.

We can solve the problem by implementing the double locking solution as shown below, where we perform another check to verify that the class has not been instantiated once the lock has been acquired.

    
    public class ThreadLockWriteLogger
    {
        private static ILogger _instance;

        private static readonly Object _lock = new object();

        public static ILogger Instance
        {
            get
            {
                if (_instance == null)
                {
                    lock (_lock)
                    {
                        if (_instance == null)
                        {
                            _instance = new ConsoleLogger();
                        }             
                    }
                }
                return _instance;
            }
        }
    }
Calling the singleton using multiple threads from the console:
    
    static void Main(string[] args)
    {
         Parallel.Invoke(() => ThreadLockWriteLogger.Instance.Log("Hello World"),
                         () => ThreadLockWriteLogger.Instance.Log("Hello World")); 
         Console.ReadLine();
    }
  ConsoleLogger constructed
  Hello World
  Hello World

Lazy Instantiated Singleton

A simplified approach is to use lazy instantiation because by default, Lazy objects are thread-safe as shown below.

    
    public class LazyLogger
    {
        private static readonly Lazy<ILogger> _instance = 
            new Lazy<ILogger>(() => new ConsoleLogger());

        public static ILogger Instance
        {
            get { return _instance.Value; }
        }
    }
Calling the lazy singleton using multiple threads from the console:
    
    static void Main(string[] args)
    {
         Parallel.Invoke(() => LazyLogger.Instance.Log("Hello World"),
                         () => LazyLogger.Instance.Log("Hello World"));
         Console.ReadLine();
    }
  ConsoleLogger constructed
  Hello World
  Hello World

Dependency Injection Singleton

A singleton is considered an anti-pattern due to the following reasons:

  • Hidden dependencies makes it is hard to tell what a class is dependent on when the dependencies are not explicitly defined (ie: the constructor).
  • Tight coupling occurs when singletons are hardcoded into an application as a static method and unnecessary complicates mocking dependencies in automated tests.
  • Single Responsibility Principle is violated since the creation of an object is mixed with the lifecycle management of the application.

Dependency Injection (DI) can remedy the problems above. Let's have a look at the example below that consists of a WriteMessageAction class with a dependency on the ILogger interface.

    
    public interface IAction<in T>
    {
        void Do(T message);
    }

    public class WriteMessageAction : IAction<string>
    {
        private readonly ILogger _logger;

        public WriteMessageAction(ILogger logger)
        {
            if (logger == null) throw new ArgumentNullException("logger");
            _logger = logger;
        }

        public void Do(string message)
        {
            _logger.Log(message);
        }
    }
We can use Unity to manage the lifecycle of the ILogger instance as shown below.
    
static void Main(string[] args)
{
     var container = new UnityContainer();
     container.RegisterType<ILogger, ConsoleLogger>(
                  new ContainerControlledLifetimeManager());
     container.RegisterType<IAction<string>, WriteMessageAction>("One");
     container.RegisterType<IAction<string>, WriteMessageAction>("Two");

     var actions = container.ResolveAll<IAction<string>>();
     actions.ForEach(action => action.Do("Hello World"));

     Console.ReadLine();
}
  ConsoleLogger constructed
  Hello World
  Hello World
The key is on line 4 to 5, since we pass in "new ContainerControlledLifetimeManager()" to register the object as a singleton. By default, unity will create a new instance of an object when we omit the ContainerControlledLifetimeManager constructor parameter as shown below:
    
{
     var container = new UnityContainer();
     container.RegisterType<ILogger, ConsoleLogger>();
     container.RegisterType<IAction<string>, WriteMessageAction>("One");
     container.RegisterType<IAction<string>, WriteMessageAction>("Two");

     var actions = container.ResolveAll<IAction<string>>();
     actions.ForEach(action => action.Do("Hello World"));

     Console.ReadLine();
}
  ConsoleLogger constructed
  ConsoleLogger constructed
  Hello World
  Hello World

Hello World

I'm Jay Strydom, and I love solving complex, real life business problems by producing simple solutions. This has been the basis of my career as a software developer and software architect.

A typical day involves coaching developers, performing code reviews, producing designs and diving into code. I enjoy explaining complex technical topics to business owners in order to solve their issues with secure, scalable and reliable solutions that they understand and agree with.

This blog was created as I'm filled with ideas and want to give something back to the community. I hope it will provide others with new ways of solving problems.  I'm open to critique and discussion to provide a learning experience in a fast moving industry. Feel free to replicate, improve or disprove a solution presented on this blog. It would be great to hear about your experience with a solution or if you notice something that you don't like then I'd love to hear from you too.