
This article is the second part of my talk at the multithreading meetup. You can have a look at the first part here. In the first part, I focused on the basic set of tools used to start a thread or a Task, the ways to track their state, and some additional neat things such as PLinq. In this part, I will focus on the issues you may encounter in a multi-threaded environment and some of the ways to resolve them.

Concerning shared resources

You can’t write a program whose work is based on multiple threads without having shared resources. Even if it works at your current abstraction level, you will find that it actually has shared resources as soon as you go down one or more abstraction levels. Here are some examples:

Example #1:

To avoid possible issues, you make the threads work with different files, one file for each thread. It seems to you that the program has no shared resources whatsoever.

Going a few levels down, you get to know that there’s only one hard drive, and it’s up to the driver or the OS to find a solution for issues with hard drive access.

Example #2:

Having read example #1, you decided to place the files on two different remote machines with physically different hardware and operating systems. You also maintain two different FTP or NFS connections.

Going a few levels down again, you understand that nothing has really changed, and the concurrent access issue is now delegated to the network card driver or the OS of the machine on which the program is running.

Example #3:

After pulling out most of your hair over your attempts to prove you can write a multi-threaded program, you decide to ditch the files completely and move the calculations to two different objects, with the reference to each object available only to its own thread.

To hammer the final dozen nails into this idea’s coffin: one runtime and Garbage Collector, one thread scheduler, physically one unified RAM, and one processor are still considered shared resources.

So, we learned that it is impossible to write a multi-threaded program with no shared resources across all abstraction levels and the whole technology stack. Fortunately, each abstraction level (as a general rule) partially or even fully takes care of the issues of concurrent access or simply forbids it right away (example: any UI framework does not allow working with elements from different threads). So typically, the issues with shared resources appear at your current abstraction level. To take care of them, the concept of synchronization is introduced.

Possible issues in multi-threaded environments

We can classify software errors into the following categories:

  1. The program doesn’t produce a result – it crashes or freezes.
  2. The program gives an incorrect result.
  3. The program produces a correct result but doesn’t satisfy some non-functional requirement – it spends too much time or resources.

In multi-threaded environments, the main issues that result in errors #1 and #2 are deadlock and race condition.

Deadlock

Deadlock is a mutual blocking of threads. There are many variations of a deadlock. The following one can be considered the most common:

While Thread#1 was doing something, Thread#2 locked resource B. Sometime later, Thread#1 locked resource A and tried to lock resource B. Unfortunately, this will never happen because Thread#2 will only release resource B after locking resource A.
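A minimal sketch of this scenario (resourceA and resourceB here are just two lock objects standing in for resources A and B):

using System.Threading;

object resourceA = new object();
object resourceB = new object();

var thread1 = new Thread(() =>
{
    lock (resourceA)
    {
        Thread.Sleep(100);       // give Thread#2 time to lock resource B
        lock (resourceB) { }     // waits forever: B is held by Thread#2
    }
});

var thread2 = new Thread(() =>
{
    lock (resourceB)
    {
        Thread.Sleep(100);       // give Thread#1 time to lock resource A
        lock (resourceA) { }     // waits forever: A is held by Thread#1
    }
});

thread1.Start();
thread2.Start();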

Race-Condition

Race-Condition is a situation in which both the behavior and the results of the calculations depend on the thread scheduler of the execution environment.

The trouble is that your program can work improperly one time in a hundred, or even in a million.

Things may get worse when these issues combine. For example, a specific behavior of the thread scheduler may lead to a deadlock.

In addition to these two issues which lead to explicit errors, there are also issues which, even if they don’t lead to incorrect calculation results, may still make the program take much more time or resources to produce the desired result. Two such issues are Busy Wait and Thread Starvation.

Busy Wait

Busy Wait is an issue that takes place when the program spends processor resources on waiting rather than on calculation.

Typically, this issue looks like the following:

while(!hasSomethingHappened)
    ;

This is an example of extremely poor code, as it fully occupies one core of your processor while not doing anything productive at all. Such code can only be justified when it is critically important to quickly process a change of a value in a different thread. And by ‘quickly’ I mean that you can’t wait even a few nanoseconds. In all other cases, that is, all cases a reasonable mind can come up with, it is much more convenient to use the variations of ResetEvent and their Slim versions. We’ll talk about them a little bit later.

Probably, some readers would suggest resolving the issue of one core being fully occupied with waiting by adding Thread.Sleep(1) (or something similar) into the loop. While it will resolve this issue, a new one will be created – the time it takes to react to changes will now be 0.5 ms on average. On the one hand, it’s not that much, but on the other, this value is catastrophically higher than what we can achieve by using synchronization primitives of the ResetEvent family.
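For comparison, here is a minimal sketch of the same wait built on ManualResetEventSlim, one of the primitives we will discuss later: the waiting thread consumes no CPU and still reacts almost immediately.

using System;
using System.Threading;

var somethingHappened = new ManualResetEventSlim(false);

var worker = new Thread(() =>
{
    somethingHappened.Wait();                    // blocks without burning a CPU core
    Console.WriteLine("Reacting to the change");
});
worker.Start();

// ... later, from another thread:
somethingHappened.Set();                         // wakes up every thread waiting on the event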

Thread Starvation

Thread Starvation is an issue with the program having too many concurrently-operating threads. Here, we’re talking specifically about the threads occupied with calculation rather than with waiting for an answer from some IO. With this issue, we lose any possible performance benefits that come along with threads because the processor spends a lot of time on switching contexts.

You can find such issues by using various profilers. The following is a screenshot of the dotTrace profiler working in the Timeline mode.

Programs that do not suffer from thread starvation usually have no pink sections on the charts representing the threads. Moreover, in the Subsystems category, we can see that the program was waiting for CPU for 30.6% of the time.

When such an issue is diagnosed, you can take care of it rather simply: you have started too many threads at once, so just start fewer threads.

Synchronization methods

Interlocked

This is probably the most lightweight synchronization method. Interlocked is a set of simple atomic operations. When an atomic operation is being executed, nothing can happen. In .NET, Interlocked is represented by the static class of the same name with a selection of methods, each one of them implementing one atomic operation.

To realize the ultimate horror of non-atomic operations, try writing a program that launches 10 threads, each one incrementing the same variable a million times. When they are done with their job, output the value of this variable. Unfortunately, it will greatly differ from 10 million. In addition, it will be different each time you run the program. This happens because even such a simple operation as an increment is not atomic: it includes reading the value from memory, calculating the new value, and writing it back to memory. So two threads can interleave these operations, and an increment will be lost.

The Interlocked class provides the Increment/Decrement methods, and it’s not difficult to guess what they’re supposed to do. They are really handy if you process data in several threads and calculate something. Such code will work much faster than the classic lock. If we used Interlocked in the situation described in the previous paragraph, the program would reliably produce a value of 10 million in any scenario.
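Here is a minimal sketch of that experiment: ten threads increment a shared counter a million times each. With a plain _value++ the printed total is unpredictable; with Interlocked.Increment it is always exactly 10,000,000.

using System;
using System.Linq;
using System.Threading;

class Counter
{
    private static int _value;

    static void Main()
    {
        var threads = Enumerable.Range(0, 10)
            .Select(_ => new Thread(() =>
            {
                for (int i = 0; i < 1000000; i++)
                    Interlocked.Increment(ref _value);   // swap in _value++ to see lost updates
            }))
            .ToArray();

        foreach (var t in threads) t.Start();
        foreach (var t in threads) t.Join();

        Console.WriteLine(_value);   // always 10000000 with Interlocked
    }
}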

The function of the CompareExchange method is not that obvious. However, its existence enables the implementation of a lot of interesting algorithms. Most importantly, the ones from the lock-free family.

public static int CompareExchange (ref int location1, int value, int comparand);

This method takes three values. The first one is passed through a reference and it is the value that will be changed to the second one if location1 is equal to comparand when the comparison is performed. The original value of location1 will be returned. This sounds complicated, so it’s easier to write a piece of code that performs the same operations as CompareExchange:

var original = location1;
if (location1 == comparand)
    location1 = value;
return original;

The only difference is that the Interlocked class implements this in an atomic way. So, if we wrote this code ourselves, we could face a scenario in which the condition location1 == comparand has already been met. But when the statement location1 = value is being executed, a different thread has already changed the location1 value, so it will be lost.

We can find a good example of how this method can be used in the code that the compiler generates for any C# event.

Let’s write a simple class with one event called MyEvent:

class MyClass {
    public event EventHandler MyEvent;
}

Now, let’s build the project in Release configuration and open the build through dotPeek with the “Show Compiler Generated Code” option enabled:

[CompilerGenerated]
private EventHandler MyEvent;
public event EventHandler MyEvent
{
  [CompilerGenerated] add
  {
    EventHandler eventHandler = this.MyEvent;
    EventHandler comparand;
    do
    {
      comparand = eventHandler;
      eventHandler = Interlocked.CompareExchange<EventHandler>(ref this.MyEvent, (EventHandler) Delegate.Combine((Delegate) comparand, (Delegate) value), comparand);
    }
    while (eventHandler != comparand);
  }
  [CompilerGenerated] remove
  {
    // The same algorithm but with Delegate.Remove
  }
}

Here, we can see that the compiler has generated a rather complex algorithm behind the scenes. This algorithm prevents us from losing a subscription when several threads subscribe to the same event simultaneously. Let’s elaborate on the add method while keeping in mind what the CompareExchange method does behind the scenes:

EventHandler eventHandler = this.MyEvent;
EventHandler comparand;
do
{
    comparand = eventHandler;
    // Begin Atomic Operation
    if (MyEvent == comparand)
    {
        eventHandler = MyEvent;
        MyEvent = Delegate.Combine(MyEvent, value);
    }
    // End Atomic Operation
}
while (eventHandler != comparand);

This is much more manageable, but probably still requires an explanation. This is how I would describe the algorithm:

If MyEvent is still the same as it was at the moment we started executing Delegate.Combine, then set it to what Delegate.Combine returns. If it’s not the case,  try again until it works.

In this way, subscriptions will never be lost. You will have to solve a similar issue if you would like to implement a dynamic, thread-safe, lock-free array. If several threads suddenly start adding elements to that array, it’s important for all of those elements to be successfully added.
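A minimal sketch of that idea (an illustration, not a production-ready collection): each Add builds a new array from a snapshot and publishes it with CompareExchange, retrying if another thread got there first.

using System;
using System.Threading;

class LockFreeList<T>
{
    private T[] _items = Array.Empty<T>();

    public void Add(T item)
    {
        while (true)
        {
            T[] snapshot = _items;                    // remember the current array
            var copy = new T[snapshot.Length + 1];    // build a new array with the item appended
            Array.Copy(snapshot, copy, snapshot.Length);
            copy[snapshot.Length] = item;

            // Publish the new array only if _items has not been replaced in the meantime;
            // otherwise take a fresh snapshot and retry.
            if (Interlocked.CompareExchange(ref _items, copy, snapshot) == snapshot)
                return;
        }
    }

    public T[] Snapshot() => _items;
}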

Monitor.Enter, Monitor.Exit, lock

These constructs are used for thread synchronization most frequently. They implement the concept of a critical section: the code written between the calls to Monitor.Enter and Monitor.Exit on a single resource can be executed by only one thread at a time. The lock operator is syntactic sugar around Enter/Exit calls wrapped in try-finally. A pleasant quality of the critical section in .NET is that it supports reentrancy. This means that the following code can be executed with no real issues:

lock(a) {
  lock (a) {
    ...
  }
}

It’s unlikely that anyone would write code in exactly this way, but if you spread this code across a few methods through the depth of the call stack, this feature can save you a few ifs. For this trick to work, the developers of .NET had to add a limitation – you can only use instances of reference types as a synchronization object, and a few bytes are added to each object where the thread identifier is written.

This peculiarity of the critical section’s work process in C# imposes one interesting limitation on the lock operator: you can’t use the await operator inside the lock operator. At first, this surprised me since a similar try-finally Monitor.Enter/Exit construction can be compiled. What’s the deal? It’s important to re-read the previous paragraph and apply some knowledge of how async/await works: the code after await won’t be necessarily executed on the same thread as the code before await. This depends on the synchronization context and whether the ConfigureAwait method is called or not. From this, it follows that Monitor.Exit may be executed on a different thread than Monitor.Enter, which will lead to SynchronizationLockException being thrown. If you don’t believe me, try running the following code in a console application – it will generate a SynchronizationLockException:

var syncObject = new Object();
Monitor.Enter(syncObject);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);

await Task.Delay(1000);

Monitor.Exit(syncObject);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);

It’s worth noting that, in a WinForms or WPF application, this code will work correctly if you call it from the main thread, as there will be a synchronization context that returns execution to the UI thread after await. In any case, it’s better not to play around with critical sections in code containing the await operator. In such cases, it’s better to use the synchronization primitives that we will look at a little bit later.
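For instance, a SemaphoreSlim initialized with a single slot can play the role of an async-friendly lock. A rough sketch:

using System.Threading;
using System.Threading.Tasks;

static class Example
{
    // A single slot means only one caller can be inside the protected section at a time.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);

    public static async Task DoWorkAsync()
    {
        await Gate.WaitAsync();          // asynchronous acquire, no thread is blocked while waiting
        try
        {
            await Task.Delay(1000);      // the protected section may safely contain await
        }
        finally
        {
            Gate.Release();              // unlike Monitor.Exit, Release is not tied to a particular thread
        }
    }
}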

While we’re on the topic of critical sections in .NET, it’s important to mention one more peculiarity of how they’re implemented. A critical section in .NET works in two modes: spin-wait and core-wait. We can represent the spin-wait algorithm like the following pseudocode:

while(!TryEnter(syncObject))
    ;

This optimization is aimed at acquiring a critical section as quickly as possible within a short period of time, on the assumption that even if the resource is currently occupied, it will be released very soon. If this doesn’t happen in a short amount of time, the thread switches to waiting in the kernel mode, which takes time – just like returning from that wait. The developers of .NET have optimized the scenario of short locks as much as possible. Unfortunately, if many threads start fighting over the critical section, it can lead to a sudden high CPU load.

SpinLock, SpinWait

Having already mentioned the cyclic wait algorithm (spin-wait), it’s worth talking about the SpinLock and SpinWait structures from the BCL. You should use them if there are reasons to believe it will always be possible to acquire the lock very quickly. On the other hand, you shouldn’t really think about them until profiling shows that your program’s bottleneck is caused by other synchronization primitives.

Monitor.Wait, Monitor.Pulse[All]

We should look at these two methods side-by-side. With their help, you can implement various Producer-Consumer scenarios.

Producer-Consumer is a pattern of multi-process/multi-threaded design implying one or more threads/processes which produce data and one or more processes/threads which process this data. Usually, a shared collection is used.

Both of these methods can only be called by a thread which currently holds the lock. The Wait method releases the lock and blocks until another thread calls Pulse.

As a demonstration of this, I wrote a little example:

object syncObject = new object();
Thread t1 = new Thread(T1);
t1.Start();

Thread.Sleep(100);
Thread t2 = new Thread(T2);
t2.Start();

(I used an image rather than text here to accurately show the instruction execution order)
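Since the image is not reproduced here, below is a rough, self-contained reconstruction of what the example might look like (the line numbers in the list that follows refer to the original image, not to this sketch):

using System;
using System.Threading;

class WaitPulseDemo
{
    static readonly object syncObject = new object();

    static void Main()
    {
        var t1 = new Thread(T1);
        t1.Start();

        Thread.Sleep(100);
        var t2 = new Thread(T2);
        t2.Start();
    }

    static void T1()
    {
        lock (syncObject)                 // enter the critical section
        {
            Thread.Sleep(500);            // do some work while holding the lock
            Monitor.Wait(syncObject);     // release the lock and wait for a Pulse
            Console.WriteLine("T1 resumed");
        }
    }

    static void T2()
    {
        lock (syncObject)                 // acquired once T1 releases the lock inside Wait
        {
            Monitor.Pulse(syncObject);    // signal T1 that it may continue
        }                                 // T1 actually continues only after this lock is released
    }
}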

Explanation: I set a 100-ms latency when starting the second thread to specifically guarantee that it will be executed later.

  • T1:Line#2 the thread is started
  • T1:Line#3 the thread enters a critical section
  • T1:Line#6 the thread goes to sleep
  • T2:Line#3 the thread is started
  • T2:Line#4 it freezes and waits for the critical section
  • T1:Line#7 it lets the critical section go and freezes while waiting for Pulse to come out
  • T2:Line#8 it enters the critical section
  • T2:Line#11 it signals T1 with the help of Pulse
  • T2:Line#14 it comes out of the critical section. T1 cannot continue its execution before this happens.
  • T1:Line#15 it comes out from waiting
  • T1:Line#16 it comes out from the critical section

There is an important remark in MSDN regarding the use of the Pulse/Wait methods: Monitor doesn’t store the state information, which means that calling the Pulse method before the Wait method can lead to a deadlock. If such a case is possible, it’s better to use one of the classes from the ResetEvent family.

The previous example clearly shows how the Wait/Pulse methods of the Monitor class work, but still leaves some questions about the cases in which we should use them. A good example is this implementation of BlockingQueue<T>. On the other hand, the implementation of BlockingCollection<T> from System.Collections.Concurrent uses SemaphoreSlim for synchronization.
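To give an idea of the pattern (this is not the implementation linked above, just an illustration), a blocking queue built on Wait/Pulse can look roughly like this:

using System.Collections.Generic;
using System.Threading;

class SimpleBlockingQueue<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly object _sync = new object();

    public void Enqueue(T item)
    {
        lock (_sync)
        {
            _queue.Enqueue(item);
            Monitor.Pulse(_sync);        // wake up one waiting consumer
        }
    }

    public T Dequeue()
    {
        lock (_sync)
        {
            while (_queue.Count == 0)    // guard against spurious wake-ups
                Monitor.Wait(_sync);     // release the lock and wait for a Pulse
            return _queue.Dequeue();
        }
    }
}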

ReaderWriterLockSlim

I dearly love this synchronization primitive, and it’s represented by the class of the same name from the System.Threading namespace. I think that a lot of programs would work much better if their developers used this class instead of the standard lock.

Idea: many threads can read, but only one can write. When a thread wants to write, new reads cannot be started – they wait until the write is finished. There is also the upgradeable-read-lock concept. You can use it when, during the process of reading, you realize there is a need to write something – such a lock will be transformed into a write-lock in one atomic operation.

In the System.Threading namespace, there is also the ReaderWriterLock class, but it’s highly recommended not to use it for new development. The Slim version helps to avoid cases which lead to deadlocks and allows a lock to be acquired quickly, as it supports synchronization in the spin-wait mode before going to the kernel mode.

If you didn’t know about this class before reading this article, I think by now you have recalled plenty of examples from recently written code where this approach to locking would have allowed the program to work more effectively.

The interface of the ReaderWriterLockSlim class is simple and easy to understand, but it’s not that comfortable to use:

var @lock = new ReaderWriterLockSlim();

@lock.EnterReadLock();
try
{
    // ...
}
finally
{
    @lock.ExitReadLock();
}

I usually like to wrap it in a class – this makes it much handier.

Idea: create ReadLock/WriteLock methods which return an object with a Dispose method. You can then acquire them in a using block, and it probably won’t differ much from the standard lock in terms of the number of lines.

class RWLock : IDisposable
{
    public struct WriteLockToken : IDisposable
    {
        private readonly ReaderWriterLockSlim @lock;
        public WriteLockToken(ReaderWriterLockSlim @lock)
        {
            this.@lock = @lock;
            @lock.EnterWriteLock();
        }
        public void Dispose() => @lock.ExitWriteLock();
    }

    public struct ReadLockToken : IDisposable
    {
        private readonly ReaderWriterLockSlim @lock;
        public ReadLockToken(ReaderWriterLockSlim @lock)
        {
            this.@lock = @lock;
            @lock.EnterReadLock();
        }
        public void Dispose() => @lock.ExitReadLock();
    }

    private readonly ReaderWriterLockSlim @lock = new ReaderWriterLockSlim();

    public ReadLockToken ReadLock() => new ReadLockToken(@lock);
    public WriteLockToken WriteLock() => new WriteLockToken(@lock);

    public void Dispose() => @lock.Dispose();
}
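With such a wrapper, taking a lock reads almost like a standard lock block:

var rw = new RWLock();

using (rw.ReadLock())
{
    // any number of threads can be here at the same time
}

using (rw.WriteLock())
{
    // only one thread can be here, and no readers are allowed
}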

Security is one of the most important requirements for a data-driven system. Encryption is one of the ways to secure the data. Wikipedia defines encryption as:

Encryption is the process of encoding a message or information in such a way that only authorized parties can access it and those who are not authorized cannot.

In SQL Server 2016, Microsoft introduced an encryption feature called Always Encrypted. In this article, we will see what Always Encrypted is, and how it can be used to encrypt and decrypt data, with the help of simple examples.

What is SQL Server Always Encrypted?

Always Encrypted is a security feature that allows the client application to manage the encryption and decryption keys, thus ensuring that only the client application can decrypt and use sensitive data.

Several encryption techniques exist, however they are not as secure as Always Encrypted. For instance, symmetric key encryption is used to encrypt data on the database side. A drawback of this approach is that if any other database administrator has the decryption key, he can access the data.

On the other hand, in case of Always Encrypted, the data is encrypted on the client side and the database server receives a ciphered version of the data. Hence, the data cannot be deciphered at the database end. Only the client that encrypted the data can decrypt it.

Key Types

SQL Server Always Encrypted feature uses two types of keys:

  • Column Encryption Key (CEK)

It is always stored on the database server. The data is actually encrypted using the CEK. However, if someone on the database side has access to the CEK, they can decrypt the data.

  • Column Master Key (CMK)

This key is placed on the client side or any third party storage. CMK is used to protect the CEK, adding an additional layer of security. Whoever has access to CMK can actually decrypt the CEK which can then be used to decipher the actual data.

Encryption Types

  • Deterministic

This type of encryption will always generate similar encrypted text for the same type of data. If you want to implement searching and grouping on a table column, use deterministic encryption for that column.

  • Randomized

Randomized Encryption will generate different encrypted text for the same type of data, whenever you try to encrypt the data. Use randomized encryption if the column is not used for grouping and searching.

Configuring Always Encrypted Using SSMS

We can configure SQL Server Always Encrypted via SSMS. But before that, we need to create a database and add a table to the database. Execute the following script to do so:

CREATE DATABASE School

Use School
CREATE TABLE Student  
(  
   StudentId int identity(1,1) primary key,  
   Name varchar(100),  
   Password varchar(100) COLLATE Latin1_General_BIN2 not null,  
   SSN varchar(20)  COLLATE Latin1_General_BIN2 not null
)

In the script above, we create a new database named School and a Student table with four columns: StudentId, Name, Password, and SSN. You can see that the Password and SSN columns have a COLLATE clause. It is necessary to specify the Latin1_General_BIN2 collation for the columns that you want to use with Always Encrypted.

Let’s now first try to add two records into the Student table.

insert into Student ( Name, Password, SSN)
VALUES ('John','abc123', '451236521478'),
('Mike','xyz123', '789541239654')

At this point of time, we have not configured Always Encrypted on any of the columns in the Student table, therefore if you try to select the records from the Student table, you will see the actual data values rather than the encrypted values. Execute the following query to select records:

SELECT * FROM Student

The output looks like this:

Let’s now use SSMS to enable Always Encrypted. As we said earlier, Always Encrypted relies on column encryption keys and column master keys.

To see the existing column encryption keys and column master keys, for the School Database, go to Databases -> School -> Security -> Always Encrypted Keys as shown in the following figure:

Since you don’t have any encrypted records in the database yet, you won’t see any CEK or CMK in the list.

Let’s now enable encryption on the Password and SSN columns of the Student table. To do so, Right Click on Databases -> School. From the dropdown menu, select Encrypt Columns option as shown in the figure below:

Click Next button on the Introduction window. From the Column Selection window, check Password and SSN columns. For the Password column, select the encryption type as Randomized. For SSN column, choose Deterministic. This is shown in the following screenshot:

Click the Next button on the Master Key Configuration window. By default, the master key is stored on the client machine as shown below:

Click the Next button on the Run Settings and the Summary windows. If everything goes fine, you should see the following Results window.

Now if you again go to Databases -> School -> Security -> Always Encrypted Keys, you should see the newly created CEK and CMK as shown in the following figure:

Now try to select records from the Student table.

SELECT * FROM Student

The output looks like this:

From the output, you can see that the Password and SSN columns have been encrypted.

Retrieving Decrypted Data

The SELECT query returned encrypted data. What if you want to retrieve data in decrypted form? To do so create a New Query Window in SSMS and then click the Change Connection icon at the top of Object Explorer as shown in the following figure:

SQL Server connection window will appear. Select Options button from the bottom right as shown below:

From the window that appears, click on Additional Connection Parameters tab from the top left and enter “Column Encryption Setting = Enabled” in the text box as shown in the following screenshot. Finally, click the Connect button.

Now again execute the following SELECT query:

SELECT * FROM Student

In the results, you will see the records returned in decrypted form, as shown below:
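The same setting works from application code. Below is a hedged sketch of a .NET client reading the encrypted columns; it assumes a driver that supports Always Encrypted (System.Data.SqlClient on .NET Framework 4.6 or later, or Microsoft.Data.SqlClient) and that the client machine has access to the column master key created by the wizard:

using System;
using System.Data.SqlClient;

class Program
{
    static void Main()
    {
        // "Column Encryption Setting=Enabled" tells the driver to decrypt/encrypt
        // the protected columns transparently on the client side.
        var connectionString =
            "Server=.;Database=School;Integrated Security=true;Column Encryption Setting=Enabled";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT Name, SSN FROM dbo.Student", connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine("{0}: {1}", reader["Name"], reader["SSN"]); // SSN arrives decrypted
            }
        }
    }
}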

Conclusion

Always Encrypted is one of the latest security features of SQL Server. In this article, we briefly reviewed what Always Encrypted is and how to enable it using SQL Server Management Studio. We also saw a basic example of encrypting and decrypting data using the Always Encrypted feature.

The post Understanding SQL Server Always Encrypted appeared first on {coding}Sight.


Database backups, integrity checks, and performance optimization are core regular tasks of DBAs. Client data is very important, so a DBA must manage database backups and ensure the integrity of those backups – if something goes wrong with a production database, it can then be recovered with minimum downtime. Database integrity checks are also important because, in the case of database corruption, the corruption can be corrected with minimum downtime and data loss. Managing database performance is equally important, and it is a combination of multiple tasks:

  1. Identify the list of resource-intensive queries and help developers to rewrite them.
  2. Create and manage indexes on the tables, and perform index defragmentation to keep them in good shape.
  3. Finally, manage the statistics of the tables.

In my previous article, I covered the topic of Auto create statistics and Auto Update Statistics and how they can help to improve performance. In this article, I am going to explain how to create and schedule the maintenance plan to update the statistics.

First, let me explain what SQL Server statistics are and how they can help to increase the performance of SQL Server.

SQL Server Statistics and their importance

Statistics are metadata used by SQL Server query optimizer, which helps to determine the best way to retrieve the data. The optimizer uses statistics to understand the data, its distribution, and the number of rows a given query is likely to return from the available statistics. Based on this information, it decides the optimal data access path. It also determines whether to do a table scan or an index seek, use nested loop join or a hash join, etc.

If statistics are out of date or unavailable, the optimizer may choose a poor execution plan, which reduces query performance significantly. SQL Server can automatically maintain statistics and refresh them based on its tracking of data modifications.

Statistics can be created and updated automatically by enabling “Auto Create Statistics” and “Auto Update Statistics.” However, for some tables, such as those subject to significant changes in data distribution, it’s possible that SQL Server automatic statistics update will be insufficient to maintain consistently high levels of query performance.

Before I explain the different approaches to updating the statistics, let me describe the different ways to review the statistics created on any table.

How to review the statistics

We can view column statistics and index statistics in two ways:

  1. Using SQL Server Management Studio.
  2. Using system stored procedures, system catalogs, and dynamic management views.

View Statistics using SQL Server Management Studio

For example, I want to see the statistics created on the [HumanResources].[Employee] table created in the AdventureWorks2017 database. To do that, launch SQL Server Management Studio. Then expand the AdventureWorks2017 database, expand the [HumanResources].[Employee] table. See the following image:

Using System stored procedures and dynamic management views

If you’re using an older version of SQL Server, you can use the sp_helpstats system procedure to review the statistics of a table. sp_helpstats shows the statistics created by SQL Server or by a user, but it does not show the statistics created by indexes. To demonstrate that, I have created a statistic named User_Statistics_BirthDate on the [HumanResources].[Employee] table.

Following is the example:

USE ADVENTUREWORKS2017 
GO
EXEC SP_HELPSTATS 
  'HUMANRESOURCES.EMPLOYEE'

Following is the output.

You can review statistics by querying the sys.stats system catalog. It provides information about the statistics created by SQL Server, by indexes, and by users.

Execute the following query:

SELECT NAME         AS 'STATISTICS NAME', 
       AUTO_CREATED AS 'CREATED AUTOMATICALLY', 
       USER_CREATED AS 'CREATED BY USER' 
FROM   SYS.STATS 
WHERE  OBJECT_ID = OBJECT_ID('HUMANRESOURCES.EMPLOYEE')

Following is the output:

Now, let’s join this table with other system catalogs to get detailed information about the statistics. To do that, execute the following query:

SELECT [SCHEMAS].[NAME] + '.' + [OBJECTS].[NAME] AS [TABLE_NAME], 
       [INDEXES].[INDEX_ID]                      AS [INDEX ID], 
       [STATS].[NAME]                            AS [STATISTIC], 
       STUFF((SELECT ', ' + [COLUMNS].[NAME] 
              FROM   [SYS].[STATS_COLUMNS] [STATS_COLUMN] 
                     JOIN [SYS].[COLUMNS] [COLUMNS] 
                       ON [COLUMNS].[COLUMN_ID] = [STATS_COLUMN].[COLUMN_ID] 
                          AND [COLUMNS].[OBJECT_ID] = [STATS_COLUMN].[OBJECT_ID] 
              WHERE  [STATS_COLUMN].[OBJECT_ID] = [STATS].[OBJECT_ID] 
                     AND [STATS_COLUMN].[STATS_ID] = [STATS].[STATS_ID] 
              ORDER  BY [STATS_COLUMN].[STATS_COLUMN_ID] 
              FOR XML PATH('')), 1, 2, '')       AS [COLUMNS_IN_STATISTIC] 
FROM   [SYS].[STATS] [STATS] 
       JOIN [SYS].[OBJECTS] AS [OBJECTS] 
         ON [STATS].[OBJECT_ID] = [OBJECTS].[OBJECT_ID] 
       JOIN [SYS].[SCHEMAS] AS [SCHEMAS] 
         ON [OBJECTS].[SCHEMA_ID] = [SCHEMAS].[SCHEMA_ID] 
       LEFT OUTER JOIN [SYS].[INDEXES] AS [INDEXES] 
                    ON [OBJECTS].[OBJECT_ID] = [INDEXES].[OBJECT_ID] 
                       AND [STATS].[NAME] = [INDEXES].[NAME] 
WHERE  [OBJECTS].[OBJECT_ID] = OBJECT_ID(N'HUMANRESOURCES.EMPLOYEE') 
ORDER  BY [STATS].[USER_CREATED] 

GO

The query above returns the following details:

  1. The table on which statistics are created.
  2. Index ID.
  3. Name of Statistics.
  4. Columns included in statistics.

Following is the output:

In the next section, I will explain different ways to update the statistics.

Different approaches to statistics update

We can update the statistics in the following ways:

  1. Create a SQL Server maintenance plan.
  2. Create Using the custom script.
  3. Manually execute the update statistics command on an individual table.

Firstly, I will explain how we can create a maintenance plan to update the statistics.

Create SQL Server Maintenance Plan to Update the Statistics

Now, in this demo, we will create an Update Statistics maintenance plan. The following steps walk through the process.

First, create a maintenance plan. To do that, open SQL Server Management Studio, expand the SQL Server instance, and then the Management folder. Under Management, right-click Maintenance Plans and select New Maintenance Plan. See the following image:

The New Maintenance Plan dialog box opens. In the box, provide a name for the maintenance plan and click OK. See the following image:

The maintenance plan designer opens. In the maintenance plan designer toolbox, drag and drop the “Update Statistics Task” onto the designer window. See the following image:

Now double-click Update Statistics Task. The Update Statistics Task dialog box opens. In the dialog box, there are options which can be used to customize the maintenance plan. See the following image:

We can customize the following options in the Update Statistics maintenance plan:

  1. Update statistics of all objects of a specific database. See the following image:
  2. Specific objects of selected databases. You can update statistics of All Tables and Views / Specific tables and views. See the following image:

    If we choose Tables or Views, then SQL Server will populate the names of the tables or views in the selection dialog box. See the following image:
  3. The third option is Update. We can update all statistics of tables/views, or we can choose to update only column statistics (statistics created on nonindexed columns), or we can choose Index statistics only (statistics created by indexes). See the following image:
  4. We can also select the scan type of any statistics. We can choose Full Scan or Sample by a specified percentage or specified rows. See the following image:

Now, as I mentioned, we will create a maintenance task which will update statistics of all tables within the AdventureWorks2017 database with a full scan. So, choose options accordingly. Once the maintenance task is configured, it looks like the following:

Once the maintenance task is configured properly, close the update statistics dialog. After configuration, the maintenance plan looks like the following:

Once the maintenance plan is created, let us schedule it. To do that, click the calendar icon next to the Description column. See the following image:

Once you click the calendar button, a dialog box to configure the job schedule opens. See the following image:

Our update statistics maintenance plan will execute every Sunday at 4:00 AM, so we will configure the job schedule accordingly. Job scheduling is straightforward. Once you configure the schedule, the dialog box looks like the following:

Once an execution schedule is configured, the entire maintenance plan looks like the following image. Save the maintenance plan and close the window.

Now, let us run this maintenance plan by executing the SQL Job created by the maintenance plan. To open SQL Jobs, expand SQL Server Agent and expand Jobs. You can see the SQL Job created by SQL maintenance plan. Now to execute the job, right click Update Statistics Weekly.Weekly.Subplan_1 and click Start Job at Step. See the following image.

Once the job is completed, you can see the following Job execution successful dialog box.

Summary

In this article, I have covered:

  1. A detailed explanation of SQL Server Statistics and their importance.
  2. Different options to update the statistics.
  3. A working example of creating a SQL Maintenance plan to update the statistics.

In my next article, I will explain various T-SQL commands to update the statistics. Moreover, I will explain a T-SQL script which updates the statistics based on the volume of data changes caused by insert, update, and delete operations on the table.

The post Update SQL Server statistics using a database maintenance plan appeared first on {coding}Sight.


This article talks about how to create a basic customer-focused report using SQL Server Reporting Services (SSRS).

The article also highlights the importance of utilizing the full potential of the SSRS reporting technology by using parameters in SSRS reports – for example, to meet business specifications.

The main focus of this article is to prepare the readers who already have a T-SQL background to create SSRS reports for customers. In this way, the customers can run these reports based on their desired input values rather than on default values provided by report developers.

About Report Parameters

Let us first get familiar with report parameters in general.

Microsoft Definition (SSRS 2016 and Later)

A report parameter provides a way to choose report data, connect related reports together, and vary the report presentation.

Simple Definition

Report parameters help end users to run reports based on their input values.

In other words, report parameters ask the end user to provide values for some defined characteristics so that the report can be run based on these user-supplied values.

Example

A simple example of a report parameter would be in a report which shows total sales for a particular year based on the value provided by the end user.

In this example, year is the report parameter which can either have a default value or a value specifically provided by the report’s end user.

In other words, this report is more like a dynamic report since it produces different results based on different parameter values provided at run time rather than at design time.

Types of Report Parameters

According to the Microsoft documentation, a report parameter can have one of the following types:

  1. Boolean (True or False)
  2. DateTime (Any valid date and time)
  3. Integer (Any valid whole number)
  4. Float (Any valid number with potential decimal place)
  5. Text (Any text)

How Report Parameters can be created

As per the Microsoft documentation, report parameters can be created in one of the following ways:

  • Automatically
  • Manually

Automatically

  1. When the query that runs behind the report in the form of a dataset already contains parameters
  2. When a shared dataset that already contains parameters is added to the report

Please note that, in this article we will specifically focus on the automatic method of creating report parameters.

Manually

You can add report parameters manually in the Report Data Pane.

Design vs Published Version

Please remember that report parameters are saved in an .rdl report file at design time. However, when the report is published, the report parameters are handled differently and are no longer bound to a part of the report file.

This helps business users to directly change report parameters after the report gets published to the report server.

Creating an SSRS report

Pre-requisites

This article assumes that you have a basic knowledge of T-SQL scripts and are also familiar with the basics of report design using one of the following tools:

  1. SQL Server Data Tools (SSDT)
  2. Report Builder
  3. 3rd-Party Report Building Tools such as dbForge Studio for SQL Server

I strongly recommend having your report server configured at this point so you can publish reports quickly, although this is not mandatory for following the instructions in this article.

Please read my article SSRS Reports Development in Simple Words to get a general understanding of how to create a simple SSRS report.

Setup the sample database

First of all, we’ll need a database that will be used as the data source for the SSRS report.

Please create and populate a sample database called SQLDevBlogV5 as follows:

-- Create the SQLDevBlogV5 sample database
CREATE DATABASE SQLDevBlogV5;
GO


USE SQLDevBlogV5;

-- (1) Create the Article table in the sample database
CREATE TABLE Article (
  ArticleId INT PRIMARY KEY IDENTITY (1, 1)
 ,Category	VARCHAR(50)
 ,Author VARCHAR(50)
 ,Title VARCHAR(150)
 ,Published DATETIME2
 ,Notes VARCHAR(400)  
)

GO

SET IDENTITY_INSERT [dbo].[Article] ON
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (1, N'Development', N'Atif', N'Introduction to T-SQL Programming ', N'2017-01-01 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (2, N'Testing', N'Peter', N'Database Unit Testing Fundamentals', N'2017-01-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (3, N'DLM', N'Sadaf', N'Database Lifecycle Management for beginners', N'2017-01-20 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (4, N'Development', N'Peter', N'Common Table Expressions (CTE)', N'2017-02-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (5, N'Testing', N'Sadaf', N'Manual Testing vs. Automated Testing', N'2017-03-20 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (6, N'Testing', N'Atif', N'Beyond Database Unit Testing', N'2017-11-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (7, N'Testing', N'Sadaf', N'Cross Database Unit Testing', N'2017-12-20 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (8, N'Development', N'Peter', N'SQLCMD - A Handy Utitliy for Developers', N'2018-01-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (9, N'Testing', N'Sadaf', N'Scripting and Testing Database for beginners ', N'2018-02-15 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (10, N'Development', N'Atif', N'Advanced Database Development Methods', N'2018-07-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (11, N'Testing', N'Sadaf', N'How to Write Unit Tests for your Database', N'2018-11-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (12, N'Development', N'Peter', N'Database Development using Modern Tools', N'2018-12-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (13, N'DLM', N'Atif', N'Designing, Developing and Deploying Database ', N'2019-01-01 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (14, N'DLM', N'Peter', N'How to Apply Database Lifecycle Management  ', N'2019-02-10 00:00:00', NULL)
SET IDENTITY_INSERT [dbo].[Article] OFF

Business Requirements and Analysis Statement

Suppose you have just received a business requirement as follows:

“As a business user, I want to view a report on all technical articles for a specified year.”

Preliminary Analysis

This business requirement clearly states that the end user of the report would like to input the year value at run time so that they can view the report on all articles for the specified year.

This leaves us no option except to create an SSRS report with a parameter so that the end user will be able to supply the year at run time and see the report.

Please try to look for keywords such as specified year in the business requirement – this gives you a hint that a report with a year parameter is required.

Create an SSRS Database Report with Parameters

Let’s begin by creating a Reports Server Project in SQL Server Data Tools (SSDT).

If you are interested in seeing a detailed walkthrough of how to create a basic SSRS report, then please go through my article SSRS Reports Development in Simple Words.

Create a New Report Server Project

Open Visual Studio and create a new Report Server Project called Articles Report with Parameters Project under the Business Intelligence template:

Add a New Shared Data Source

As with any other SSRS report project, we need to create either a shared or embedded data source to let the project know from where we are sourcing data for the report.

Right-click Shared Data Sources under Articles Report with Parameters, click Add New Data Source and establish a connection with the sample database (SQLDevBlogV5):

Design a Dataset (Report) Query with Parameters

Once the data source has been setup, it is time to design your T-SQL query with parameters.

In order to design and test a parameterized query, you need to declare a year variable to be used in T-SQL with the help of the WHERE clause.

Open a new query window and run the following script against SQLDevBlogV5 to make sure it pulls the desired data from the database:

-- View Articles by year using the Year variable

DECLARE @Year INT -- declare the year variable to be used in the query
SET @Year=2018 -- initialize the year variable

-- View articles based on the year value (2018)
SELECT * FROM dbo.Article WHERE year(published)=@Year

Now, run the query to see the results:

The T-SQL script with a variable works well, so we can exclude the variable declaration and initialization and move on with only the following part of the script remaining:

SELECT * FROM dbo.Article WHERE year(published)=@Year

Please remember that this dataset query is going to generate the parameter automatically.

Add a new Dataset including Year

Right-click Shared Datasets under the project node and click Add New Dataset. Then, add a new dataset DSet_ArticlesByYear and enter the query in the corresponding input field:

Configure the Dataset Parameter

Now, click the Parameters section, set Data Type to integer, and Click OK.

Add a new report with a parameter

Right-click Reports, choose Add, and then click New Item… as follows:

Enter the report’s name (‘ArticlesByYearReport’) and Click Add:

Have a look at the blank report we just created:

Configure the Report Data Pane

Go to the Report Data Pane which appears as soon as the report is created, right-click on Data Sources, and Click Add Data Source…

Name the data source DS_ArticlesByYear, select the shared data source we created earlier in this project, and Click OK:

Right-click the Report Data pane, click Add Dataset…, choose the Use a shared dataset option, and Click OK:

Add Page Header

Right-click the report design surface, click Insert, and then click Page Header:

Next, right-click on the page header section, click Insert, then click the text box and write “Articles By Year Report”:

Add a Table, Drag & Drop the Fields

Right-click the report’s design surface, click Insert, and then click Table as follows:

Next, drag the fields from the Dataset onto the table one-by-one and make the headers bold from the toolbar menu as follows:

Run the report

Let’s do some more formatting before we run the report.

Click Preview, enter 2019, and click View Report:

Congratulations! You have successfully created an SSRS report with a parameter to show all articles written by the authors based on the year value provided at run time by the business user.

Things to do

Now that you are familiar with SSRS development fundamentals, please try the following:

  1. Create an SSRS report with the Category name as a parameter
  2. Create an SSRS report with the Author name as a parameter
  3. Keeping the example in this article in mind, please create one SSRS report with the student name as a parameter and another report with the course name as a parameter based on the sample database named TechnicalTraining mentioned in this article

The post Creating Customer-Focused SSRS Reports with Parameters appeared first on {coding}Sight.


Documenting a SQL Server database is a continuous process that should start during the database design and development phases and continue throughout the database lifecycle, so that there is always an up-to-date version of the documentation that reflects reality at any point in time. If performed properly, the generated documentation will contain an up-to-date, complete list of the database objects and a brief description of each of them.

The SQL Server database documentation process can be performed in multiple ways. You can simply create a database diagram that shows a list of all database tables and columns and update this diagram whenever a change is made. But reading and maintaining such a diagram is not easy for large databases with dozens of tables, each containing dozens of columns.

Starting from SQL Server 2005, Microsoft introduced a feature called Extended Properties, which are stored in the database itself, accessed using the sys.extended_properties system catalog view, and return metadata associated with the specified database or database objects. Documenting a SQL Server database using extended properties is not the best choice: you can document only one database at a time, there is no historical data because an object’s properties are deleted when the object is deleted, and it is not a piece of cake – it requires good development skills, a big effort, and a lot of time.

Using Visual Studio

Developers who are familiar with Microsoft Visual Studio can easily take benefits from the SQL Server project type to connect to a SQL Server database and check the metadata about the database objects.

To achieve that, open the Visual Studio tool and create a new SQL Server Database Project from the New Project window, as below:

In the New Project window, provide a unique name for the project and the location where it will be saved, then click OK to create the SQL Server Database Project. When the project is created, open the project properties and set the Target Platform value to the SQL Server version of the target database, as shown below:

To connect to a specific database, right-click  the created project and choose Import -> Database option as follows:

From the Import Database window, select a connection from the previously saved connections list or provide the server name, authenticated credentials and the database name to connect to the database to be documented, as follows:

When you click the Connect button, the tool will start collecting metadata information about all database objects, as shown below:

After collecting and importing all database information, the selected database objects will be displayed in the solution explorer, categorized per schema, as follows:

To view metadata information about any database object, expand the schema in the solution explorer and click that object; a new window will open showing the full description of the selected object, along with the T-SQL script to create it, as shown below:

It also provides you with the ability to show the Description column for each database object, by right-clicking on the free space beside the selected table and choosing the Description option. A new column will be displayed showing description for each column, with the ability to edit the description, as shown below:

Although it is quite easy to document your database using Visual Studio, it does not provide a centralized place to check the objects of multiple databases, it provides information about a single database per project, and it cannot be exported to a user-friendly or printable format!

Using dbForge Documenter for SQL Server

To save your time and effort and have your database documentation up to date, it is better to use a 3rd party tool that makes the documentation process easier. dbForge Documenter for SQL Server is a database documentation tool that can be easily connected to your database and generates documentation of all SQL Server database objects in a few clicks.

dbForge Documenter for SQL Server provides a wide range of style templates and options that help customize the generated documentation to meet your own requirements. In a few seconds of configuration, dbForge Documenter for SQL Server extracts all the information and extensive details about the selected database, as well as inter-object dependencies and the DDL T-SQL scripts to create these objects, with the ability to export the documentation in searchable HTML, PDF, and Markdown file formats. The HTML format helps in publishing the database documentation on the web, and the PDF format is suitable for distributing to other systems and sharing with other devices. dbForge Documenter for SQL Server can also be accessed directly from SQL Server Management Studio, as it is integrated with SSMS.

dbForge Documenter for SQL Server can be downloaded from the Devart download center and installed to your server by going through the straight-forward installation wizard, as below:

When you click the Install button to start the installation process, you will be asked to specify the installation path for the tool, whether to create a desktop icon for faster access, the versions of SQL Server Management Studio in which the tool should be available as an add-in, the file extensions to be associated with dbForge Documenter for SQL Server, and finally the startup optimization mode for the tool. After that, the installation process starts, with a useful progress bar that shows what is being installed at the moment, as shown below:

When the installation process completes successfully, the wizard will notify you and provide you with an option to launch the tool directly, as follows:

The first view of dbForge Documenter for SQL Server will be similar to the window below. To create documentation for your database, click New Documentation on the welcome page, as below:

In the opened documentation window, click Add Connection to select an existing connection or add a new one by providing the server name, valid credentials, and the name of the database to connect to, using the friendly page below:

After connecting successfully to the database, dbForge Documenter for SQL Server will list all databases and database objects under the connected SQL Server instance. In the beginning, it provides you with an option to provide a unique name and description for the documentation to be generated, in addition to your own logo, name, and date to be displayed in that documentation, as shown below:

To document a specific database or database objects, check the name of the database in the databases list, then review and tune the different database properties and options to be included in the documentation by turning the include button beside each property and option on or off, as shown below:

After customizing what to include in your documentation, click on the Generate option to generate database documentation, based on your selections, as follows:

In the Generate Documentation window, specify the format of the generated documentation as well as the path and name of the generated file, as shown below:

If you click the Generate button, the documentation generation process will start, with a user-friendly checklist and progress bar to show the current status of the generation process, as below:

When the documentation generation process completes successfully, dbForge Documenter for SQL Server will notify you of the final result, as below:

Browsing to the path where the file is saved, you will see that the database documentation is generated under that path in PDF format, as shown below:

The report will be opened also in the dbForge Documenter for SQL Server tool, showing description for the database, list of all database objects and files and the properties and options for the selected database, as shown below:

dbForge Documenter for SQL Server provides you also with the ability to dive deeply on each database object. For example, click on the Tables hyperlink, choose the table you are interested in and full information about the selected table will be displayed in the report, as shown below:

It is clear from the example above how you can use the dbForge Documenter for SQL Server third-party tool to generate customizable documentation for your databases in a few clicks, documentation that can serve multiple purposes. Go and try documenting your database using dbForge Documenter for SQL Server!

The post How to Document Your SQL Server Database appeared first on {coding}Sight.

Introduction

Developers are often told to use stored procedures in order to avoid the so-called ad hoc queries which can result in unnecessary bloating of the plan cache. You see, when recurrent SQL code is written inconsistently or when there’s code that generates dynamic SQL on the fly, SQL Server has a tendency to create an execution plan for each individual execution. This may decrease overall performance by:

  1. Demanding a compilation phase for every code execution.

  2. Bloating the Plan Cache with too many plan handles that may not be reused.

Optimize for Ad Hoc Workloads

One way this problem has traditionally been handled is by optimizing the instance for Ad Hoc Workloads. Doing this is only helpful if most databases (or the most significant ones) on the instance predominantly execute ad hoc SQL.

Fig. 1 Optimize for Ad Hoc Workloads

--Enable OFAW Using T-SQL 

EXEC sys.sp_configure N'show advanced options', N'1'
RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'optimize for ad hoc workloads', N'1'
GO
RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'show advanced options', N'0'
RECONFIGURE WITH OVERRIDE
GO

Essentially, this option tells SQL Server to save a partial version of the plan known as the compiled plan stub. The stub occupies much less space than the entire plan.
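If you want to verify that stubs are actually being cached, a hedged sketch against sys.dm_exec_cached_plans (stubs show up with a cacheobjtype of 'Compiled Plan Stub') could look like this:

--Count cached objects by type; stubs appear as 'Compiled Plan Stub'
SELECT cacheobjtype, 
       objtype, 
       COUNT(*) AS cached_objects 
FROM   sys.dm_exec_cached_plans 
GROUP  BY cacheobjtype, objtype 
ORDER  BY cached_objects DESC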

As an alternative to this method, some people approach the issue rather brutally and flush the plan cache every now and then. Or, in a more careful way, flush “single-use plans” by using DBCC FREESYSTEMCACHE. Flushing the entire plan cache has its downsides, as you may already know.
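As a hedged sketch of that more careful approach, the first query below estimates how much memory single-use ad hoc plans occupy, and the DBCC call then clears only the ad hoc and prepared plans instead of the whole cache:

--Estimate the footprint of plans that were used exactly once
SELECT COUNT(*) AS single_use_plans, 
       SUM(CAST(size_in_bytes AS BIGINT)) / 1024 / 1024 AS size_mb 
FROM   sys.dm_exec_cached_plans 
WHERE  usecounts = 1 
       AND objtype IN ('Adhoc', 'Prepared')

--Remove only the ad hoc and prepared plans, leaving procedure plans intact
DBCC FREESYSTEMCACHE ('SQL Plans')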

Using Stored Procedures and Parameters

By using stored procedures, one can virtually eliminate the problem caused by Ad Hoc SQL. A stored procedure is compiled only once, and the same plan is reused for subsequent executions of the same or similar SQL queries. When stored procedures are used to implement business logic, the key difference in the SQL queries that will eventually be executed by SQL Server lies in the parameters passed at execution time. Since the plan is already in place and ready for use, SQL Server will use the same plan no matter what parameter is passed.

Skewed Data

In certain scenarios, the data we are dealing with is not distributed evenly. We can demonstrate this – first, we will need to create a table:

--Create Table with Skewed Data
use Practice2017
go
create table Skewed (
ID int identity (1,1)
, FirstName varchar(50)
, LastName varchar(50)
, CountryCode char(2)
);

insert into Skewed values ('Kwaku','Amoako','GH')
go 10000
insert into Skewed values ('Kenneth','Igiri','NG')
go 10
insert into Skewed values ('Steve','Jones','US')
go 2

create clustered index CIX_ID on Skewed(ID);
create index IX_CountryCode on Skewed (CountryCode);

Our table contains data of club members from different countries. A large number of club members are from Ghana, while the two other nations have ten and two members respectively. To keep focused on the agenda and for simplicity’s sake, I only used three countries and the same name for all members coming from the same country. Also, I added a clustered index on the ID column and a non-clustered index on the CountryCode column to demonstrate the effect of different execution plans for different values.

Fig. 2 Execution plans for two queries

When we query the table for records where CountryCode is NG and GH, we find that SQL Server uses two different execution plans in these cases. This happens because the expected number of rows for CountryCode=’NG’ is 10, while that for CountryCode=’GH’ is 10000. SQL Server determines the preferable execution plan based on table statistics. If the expected number of rows is high compared with the total number of rows in the table, SQL Server decides that it is better to simply do a full table scan rather than referring to an index. With a much smaller estimated number of rows, the index becomes useful.
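You can look at the histogram the optimizer relies on here yourself; a hedged sketch against the IX_CountryCode index created above:

--Row-count estimates per CountryCode value come from this histogram
DBCC SHOW_STATISTICS ('dbo.Skewed', 'IX_CountryCode') WITH HISTOGRAM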

Fig. 3 Estimated number of rows for CountryCode=’NG’

Fig. 4 Estimated number of Rows for CountryCode=’GH’

Enter Stored Procedures

We can create a stored procedure to fetch the records we want by using the very same query. The only difference this time is that we pass CountryCode as a parameter (see Listing 3). When doing this, we discover that the execution plan is the same no matter what parameter we pass. The plan that will be used is the one generated the first time the stored procedure is invoked. For example, if we run the procedure with CountryCode=’GH’ first, it will use a full table scan from that point on. If we then clear the procedure cache and run the procedure with CountryCode=’NG’ first, it will use index seeks in the future.

--Create a Stored Procedure to Fetch the Data
use Practice2017
go
select * from Skewed where CountryCode='NG';
select * from Skewed where CountryCode='GH';

create procedure FetchMembers 
(
@countrycode char(2)
)
as 
begin
select * from Skewed where CountryCode=@countrycode
end;


exec FetchMembers 'NG';
exec FetchMembers 'GH';

DBCC FREEPROCCACHE
exec FetchMembers 'GH';
exec FetchMembers 'NG';

Fig. 5 Index seek execution plan when ‘NG’ is used first

Fig. 6 Clustered index scan execution plan when ‘GH’ is used first

Execution of the stored procedure is behaving as designed – the required execution plan is used consistently. However, this can be an issue because one execution plan is not suited for all queries if the data is skewed. Using an index to retrieve a collection of rows almost as large as the entire table is not efficient – neither is using a full scan to retrieve only a small number of rows. This is the Parameter Sniffing problem.

Possible Solutions

One common way to manage the Parameter Sniffing problem is to deliberately invoke recompilation whenever the stored procedure is executed. This is far less drastic than flushing the entire Plan Cache (though flushing the cached plan of just this specific query is also entirely possible). Take a look at an updated version of the stored procedure. This time, it uses OPTION (RECOMPILE) to manage the problem. Fig. 7 shows that, whenever the new stored procedure is executed, it uses a plan appropriate to the parameter we are passing.

--Create a New Stored Procedure to Fetch the Data
create procedure FetchMembers_Recompile
(
@countrycode char(2)
)
as 
begin
select * from Skewed where CountryCode=@countrycode OPTION (RECOMPILE)
end;

exec FetchMembers_Recompile 'GH';
exec FetchMembers_Recompile 'NG';

Fig. 7 Behaviour of the stored procedure with OPTION (RECOMPILE)
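OPTION (RECOMPILE) is not the only hint that helps here. Another option, shown below only as a hedged sketch reusing the Skewed table, is OPTIMIZE FOR UNKNOWN, which makes the optimizer build the plan for an average parameter value instead of the sniffed one:

--A hypothetical variant of the procedure using OPTIMIZE FOR UNKNOWN
create procedure FetchMembers_OptimizeForUnknown
(
@countrycode char(2)
)
as 
begin
select * from Skewed where CountryCode=@countrycode OPTION (OPTIMIZE FOR (@countrycode UNKNOWN))
end;

exec FetchMembers_OptimizeForUnknown 'GH';
exec FetchMembers_OptimizeForUnknown 'NG';

The resulting plan is stable and compiled once, but it is based on average density, so for heavily skewed data it may still favor one group of values over the other.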

Conclusion

In this article, we have looked at how consistent execution plans for stored procedures can become a problem when the data we are dealing with is skewed. We have also demonstrated this in practice and learned about a common solution to the problem. I dare say this knowledge is invaluable to developers who use SQL Server. There are a number of other solutions to this problem – Brent Ozar went deeper into the subject and highlighted some more profound details and solutions at SQLDay Poland 2017. I have listed the corresponding link in the reference section.

References

Plan Cache and Optimizing for Adhoc Workloads

Identifying and Fixing Parameter Sniffing Issues

The post Parameter Sniffing Primer appeared first on {coding}Sight.


In part two of the article, we will look at the remaining stages of Jenkins plugin implementation.
If you haven’t read the first part yet, please feel free to check it out before reading further.

Implementing the business logic (back-end)

Let’s look at how the business logic of a Jenkins plugin can be implemented by using the Build step implementation as an example. The approach will be similar for other step types (pre-build, post-build, publisher).

To implement any plugin that creates a Build step on the output, we will need to first implement a class that inherits the Builder abstract class. Builder encapsulates the basic logic of any build step.

public class MyStepBuilder extends Builder {

When implementing a plugin, the UI configuration part (Jelly file) and the Java file are directly related, so all required plugin parameters should be set via a constructor. To do this, mark the constructor with the @DataBoundConstructor annotation and pass all required parameters to it as arguments.

private final String param1, param2;
 
@DataBoundConstructor
public MyStepBuilder (String param1, String param2) 
{
  this.param1 = param1;
  this.param2 = param2;
}

Next, we’ll need to define the getters for the declared fields. This is done to make sure that the UI can get the necessary values when an already created step configuration is being modified.

public String getParam1() 
{
  return param1;
}

Optional parameters are defined via the setters. For the UI to have access to them, they should be marked with the @DataBoundSetter annotation.

@DataBoundSetter
public void setOptionalParam (String optionalParam) {
  this.optionalParam = optionalParam;
}

Now, let’s look at an implementation of the perform method. The build step actions are executed directly within it:

@Override
public boolean perform(AbstractBuild<?, ?> build, Launcher launcher, BuildListener listener) 
{
	// some code
}

In this method, we have access to:

  • build – the entity that provides us with the build settings and various info about it.

  • launcher – the Jenkins executable environment entity.

  • listener – the entity that allows us to access the logger and control the build results.

We should also note a few additional methods that can be overridden and may prove useful. For example, the prebuild method can be used to provide additional environment validation before the build. The getRequiredMonitorService() method is used for synchronization with other builds within the scope of the task. This can be useful when integrating with internal tools that don’t support parallel use. The method can return one of the following values:

  • BuildStepMonitor.BUILD – if the step requires the absence of unfinished builds

  • BuildStepMonitor.STEP – if the step requires the absence of similar unfinished steps in other builds

  • BuildStepMonitor.NONE – if synchronization is not required

Implementing validation

In addition to validating the environment before the build, you can validate parameter values as they are being set. This is achieved by implementing a descriptor inside your Builder class. The descriptor should extend the BuildStepDescriptor abstract class, and the @Extension annotation is used to connect it with the Builder class. For example:

@Extension
public static final class DescriptorImpl extends BuildStepDescriptor<Builder> {

Inside it, you can set additional parameters for your step by overriding methods. For example, the display name and the help file can be set like this:

@Override
public String getDisplayName() 
{
    return "Name for our plugin";
}
 
@Override
public String getHelpFile() 
{
    return "/plugin/MyJenkinsPlugin/resources/io/jenkins/plugins/sample/help-myplugin.html";
}

To validate a specific field with the help of a descriptor, you will need to create a public method with the following signature:

FormValidation doCheck[FieldNameInCamelCase](@QueryParameter String value)

Inside, by implementing the logic of checking a specific field’s value, we can achieve validation directly at the moment when the user configures the step. For example:

public FormValidation doCheckParam1(@QueryParameter String value) 
{
  if (value.length() == 0)
    return FormValidation.error(Messages.MyBuilder_DescrImpl_errors_missingParam1());
  return FormValidation.ok();
}

This particular implementation of the doCheck method will display an error message if the user leaves the param1 field empty. You won’t need to write any additional code in the Jelly file – Jenkins will connect the fields with the validator on its own.

Also, the descriptor contains many other methods (load, save, configure etc.). Redefining them can be helpful when implementing the build settings validation on the configuration stage.

JUnit test coverage

When you’re writing Jenkins plugins, it is implied that all developed steps will be extensively covered by JUnit tests. An example of such tests is provided in the sample plugin generated by Maven.

Two entities/mocks that are very useful for test coverage:

  • JenkinsRule – a mock of the Jenkins executable environment

  • FreeStyleProject – a mock of a Jenkins project

When using them, it is quite easy to test if the connection between the Jelly file and Java is configured properly:

@Rule
public JenkinsRule jenkins = new JenkinsRule();
 
private final String param1 = "param1", param2 = "param2";
 
@Test
public void testConfigRoundtrip() throws Exception 
{
  FreeStyleProject project = jenkins.createFreeStyleProject();
  project.getBuildersList().add(new MyStepBuilder(param1, param2));
  project = jenkins.configRoundtrip(project);
 
  MyStepBuilder myStepBuilder = new MyStepBuilder(param1, param2);
  jenkins.assertEqualDataBoundBeans(myStepBuilder, project.getBuildersList().get(0));
}

If the binding is incorrect, the two following results are possible:

  • An exception will occur (for example, the constructor accepts a different set of parameters)
  • The object’s end state will not correspond to the reference state

The post Jenkins Plugin Implementation – Part 2 appeared first on {coding}Sight.


Statistics are lightweight objects that are used by the SQL Server query optimizer to determine the optimal way to retrieve data from a table. The optimizer uses the histogram of column statistics to choose the optimal query execution plan. If a query uses a predicate that already has statistics, the query optimizer can get all the required information from those statistics to determine the optimal way to execute the query. SQL Server creates statistics in two ways:

  1. When a new index is created on a column.
  2. If the AUTO_CREATE_STATISTICS option is enabled.

In this article, the Auto Create Statistics and Auto Update Statistics options are analyzed. They are database-specific and can be configured using SQL Server Management Studio or a T-SQL query. The following topics are covered in this article:

  • A detailed explanation of Auto Create Statistics and Auto Update Statistics options.

  • Different approaches to enable Auto Create Statistics and Auto Update Statistics options.

  • A working example of Auto Create Statistics and Auto Update Statistics options.

  • Auto Update Statistics: Good or Bad.

Auto Create Statistics

When a user creates an index on a table, the SQL Server optimizer by default creates statistics on the indexed column. If the Auto Create Statistics option is enabled, the optimizer also creates statistics on non-indexed columns that are used in a query predicate. These statistics are created for individual columns, only when the column does not already have statistics on it, and they are used to generate an optimal query execution plan. The naming convention for statistics created by the Auto Create Statistics option is _WA_Sys_ followed by the column identifier and the table's object ID in hexadecimal format, for example, _WA_SYS_00000003_01142BA1.
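To see which statistics SQL Server has auto-created on a table, a hedged sketch against the catalog views (using the Users table that is created later in this article) is:

SELECT s.name AS statistics_name, 
       c.name AS column_name, 
       s.auto_created 
FROM   sys.stats AS s 
       INNER JOIN sys.stats_columns AS sc 
               ON s.object_id = sc.object_id 
                  AND s.stats_id = sc.stats_id 
       INNER JOIN sys.columns AS c 
               ON sc.object_id = c.object_id 
                  AND sc.column_id = c.column_id 
WHERE  s.object_id = OBJECT_ID('dbo.Users')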

Auto Update Statistics

As the name suggests, when the Auto Update Statistics option is enabled, SQL Server automatically updates statistics once they become outdated. Statistics are considered outdated if:

  1. An insert operation is performed on an empty table.
  2. The table had 500 rows or fewer when the statistics were generated, and the column modification counter has since changed by more than 500.
  3. The table had more than 500 rows when the statistics were generated, and the column modification counter has since changed by more than 500 + 20% of the number of rows in the table.

Different Approaches to Enable Auto Create Statistics and Auto Update Statistics

Now, we can enable Auto Create Statistics and Auto Update Statistics using the following methods.

Method 1: Using SQL Server Management Studio

To enable Auto Update Statistics, open SQL Server Management Studio. In Object Explorer, expand the SQL Server instance, right-click the database on which you want to enable Auto Update Statistics, and select Properties. See the image below:

After that, the Database Properties dialog window opens. In the dialog window, click Options. In the right pane, you will see the following options:

  • Auto Update Statistics
  • Auto Create Statistics.

See the image below:

To enable an option, click the drop-down box next to Auto Update Statistics or Auto Create Statistics and select True. See the image below:

After you’ve selected TRUE, click OK to close the dialog box.

Method 2: Using T-SQL Script

We can also change Auto Create Statistics and Auto Update Statistics using T-SQL. To do that, open SQL Server Management Studio and press Ctrl + N to open a new query editor window. In the query editor window, run the query given below.

USE [master] 
go 

ALTER DATABASE [StackOverflow2013] 
SET auto_create_statistics ON 
go 

ALTER DATABASE [StackOverflow2013] 
SET auto_update_statistics ON WITH no_wait 
go

Once both options are enabled, run the following query to verify:

SELECT NAME, 
       CASE 
         WHEN is_auto_create_stats_on = 1 THEN 'Enabled' 
         ELSE 'Disabled' 
       END AS 'Auto Create Statistics Status', 
       CASE 
         WHEN is_auto_update_stats_on = 1 THEN 'Enabled' 
         ELSE 'Disabled' 
       END AS 'Auto Update Statistics Status' 
FROM   sys.databases 
WHERE  database_id > 4

The output is supposed to be as follows:

Demo Setup

In this demonstration, I am going to use the StackOverflow2013 database. It can be downloaded from here. Once you’ve extracted the files, attach the database to your SQL Server instance and create a new database named DemoDatabase. After that, create a Users table in DemoDatabase. To do that, run the following query:

USE [DEMODATABASE] 
GO 

CREATE TABLE [DBO].[USERS] 
  ( 
     [ID]             [INT] IDENTITY(1, 1) NOT NULL, 
     [ABOUTME]        [NVARCHAR](MAX) NULL, 
     [AGE]            [INT] NULL, 
     [CREATIONDATE]   [DATETIME] NOT NULL, 
     [DISPLAYNAME]    [NVARCHAR](40) NOT NULL, 
     [DOWNVOTES]      [INT] NOT NULL, 
     [EMAILHASH]      [NVARCHAR](40) NULL, 
     [LASTACCESSDATE] [DATETIME] NOT NULL, 
     [LOCATION]       [NVARCHAR](100) NULL, 
     [REPUTATION]     [INT] NOT NULL, 
     [UPVOTES]        [INT] NOT NULL, 
     [VIEWS]          [INT] NOT NULL, 
     [WEBSITEURL]     [NVARCHAR](200) NULL, 
     [ACCOUNTID]      [INT] NULL 
  ) 
GO

Once the table is created in DemoDatabase, we will import data from Users table which is in StackOverflow2013 database. To do that, execute the following code in StackOverflow2013 database:

INSERT INTO DEMODATABASE..USERS 
            (ID, 
             ABOUTME, 
             AGE, 
             CREATIONDATE, 
             DISPLAYNAME, 
             DOWNVOTES, 
             EMAILHASH, 
             LASTACCESSDATE, 
             LOCATION, 
             REPUTATION, 
             UPVOTES, 
             VIEWS, 
             WEBSITEURL, 
             ACCOUNTID) 
SELECT ID, 
       ABOUTME, 
       AGE, 
       CREATIONDATE, 
       DISPLAYNAME, 
       DOWNVOTES, 
       EMAILHASH, 
       LASTACCESSDATE, 
       LOCATION, 
       REPUTATION, 
       UPVOTES, 
       VIEWS, 
       WEBSITEURL, 
       ACCOUNTID 
FROM   USERS

Output

(2465713 rows affected).

The query above will insert 2465713 rows into the Users table created in DemoDatabase. Please note that we have not created any indexes or statistics on the Users table. To verify that, run the following query:

USE demodatabase 
go 
EXEC Sp_helpstats 
  'Users'

As neither indexes nor statistics have been created, the output will be as follows:

Now, the AUTO_CREATE_STATISTICS option is enabled on DemoDatabase, meaning that the SQL optimizer will automatically create statistics on non-indexed columns used in query predicates. Run the following queries to check the behavior of the query optimizer:

USE DEMODATABASE 
GO 

SELECT DISTINCT AGE 
FROM   USERS 
GO 

SELECT DISTINCT LOCATION 
FROM   USERS 
GO

As we have enabled AUTO_CREATE_STATISTICS on DemoDatabase, SQL Server will automatically create statistics on the Age and Location columns. To verify that, run the following query:

USE demodatabase 
go 
EXEC Sp_helpstats 
  'Users'

The output is supposed to be as follows:

As you can see in the image above, SQL Optimizer automatically creates statistics on Age and Location columns.

As I mentioned before, when Auto Update Statistics is enabled, SQL Server updates the statistics of a column if the table has more than 500 rows and the row modification counter has changed by more than 500 + 20% of the rows since the last statistics update. A hedged way to check the modification counter directly is shown below.
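The sketch below (assuming a build of SQL Server where sys.dm_db_stats_properties is available) exposes the row count, last update time, and modification counter for every statistics object on the Users table:

SELECT s.name, 
       sp.last_updated, 
       sp.rows, 
       sp.modification_counter 
FROM   sys.stats AS s 
       CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp 
WHERE  s.object_id = OBJECT_ID('dbo.Users')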

Now, let’s consider the behavior of the query optimizer when Auto Update Statistics is enabled. Before we run our workload, let’s first check the last statistics update time. To do that, open SQL Server Management Studio >> expand the database >> expand the Users table >> right-click the statistics object under Statistics and select Properties. See the image below:

In Statistics Properties dialog window you can find the value for “Statistics for these columns was last updated.” In the picture given below, it is 5/29/2019 01:03:12 PM.

Alternatively, you can check the last time for statistics update by running the following query:

DBCC SHOW_STATISTICS(<Table Name>, StatisticsName) WITH STAT_HEADER
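An alternative sketch that avoids DBCC and returns the last update time for every statistics object on the table uses the STATS_DATE function:

SELECT name AS statistics_name, 
       STATS_DATE(object_id, stats_id) AS last_updated 
FROM   sys.stats 
WHERE  object_id = OBJECT_ID('dbo.Users')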

Now let’s run our workload to verify that statistics is being updated. Execute the following query:

INSERT INTO DEMODATABASE..USERS 
            (ID, 
             ABOUTME, 
             AGE, 
             CREATIONDATE, 
             DISPLAYNAME, 
             DOWNVOTES, 
             EMAILHASH, 
             LASTACCESSDATE, 
             LOCATION, 
             REPUTATION, 
             UPVOTES, 
             VIEWS, 
             WEBSITEURL, 
             ACCOUNTID) 
SELECT ID, 
       ABOUTME, 
       AGE, 
       CREATIONDATE, 
       DISPLAYNAME, 
       DOWNVOTES, 
       EMAILHASH, 
       LASTACCESSDATE, 
       LOCATION, 
       REPUTATION, 
       UPVOTES, 
       VIEWS, 
       WEBSITEURL, 
       ACCOUNTID 
FROM   USERS

Once the query is executed, check the statistics properties by running the following query:

DBCC SHOW_STATISTICS('USERS', '_WA_SYS_00000003_01142BA1') WITH STAT_HEADER

The output is supposed to be as follows:

As you can see in the picture above, statistics has been updated after executing the workload on the Users table.

Auto Update Statistics: Good or Bad

The most important question is, “Should I enable Auto Update Statistics or not?” The answer is, “It depends on the workload and how the application is configured.”

As we know, outdated statistics cause many performance issues because the SQL optimizer cannot build an optimal path to retrieve data from a table; hence, statistics must be kept up to date. In that respect, Auto Update Statistics is always a good option, but the real concern is OLTP applications.

In an OLTP database, there is a chance that Auto Update Statistics can reduce performance, because updating statistics is a resource-intensive operation and it will impact other transactions. Similarly, if the application executes bulk insert or update operations, the automatic statistics update can hurt performance. One mitigation is sketched below.
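If synchronous updates interfere with OLTP transactions, one hedged option (it requires Auto Update Statistics itself to stay enabled) is to switch to asynchronous statistics updates, so that queries keep using the old statistics while the refresh runs in the background:

ALTER DATABASE [DemoDatabase] 
SET AUTO_UPDATE_STATISTICS_ASYNC ON 
GO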

But in most cases, it is always advisable to enable auto update statistics.

Summary

In this article, the following topics are covered:

  1. A detailed explanation of the Auto Create Statistics and Auto Update Statistics options.
  2. Different approaches to enable the Auto Create Statistics and Auto Update Statistics options.
  3. A working example of Auto Create Statistics and Auto Update Statistics.
  4. Auto Update Statistics: Good or Bad.

The post Auto Create Statistics and Auto Update Statistics appeared first on {coding}Sight.


The article is dedicated to the fundamentals of SQL Server Reporting Services (SSRS) development and aimed at beginners and professionals interested in database development.

A direct walkthrough method is used to discuss the core concepts and their implementation with regard to SQL Server Reporting Services (SSRS).

The main focus of the article is to give the basic concepts of reports development rather than discuss the latest SQL Server Reporting Services (SSRS) versions and their features.

About SQL Server Reporting Services (SSRS)

First, let’s concentrate on a few important facts about SQL Server Reporting Services (SSRS) in the light of Microsoft documentation.

Microsoft Definition (SSRS 2016 and Later)

Considering SSRS 2016 and later, SQL Server Reporting Services (SSRS) provides a set of on-premises tools and services that are used to create, deploy, and manage mobile and paginated reports.

Simple Definition

SQL Server Reporting Services (SSRS) facilitates database reports development, deployment, and management.

In other words, SQL Server Reporting Services (SSRS) helps you quickly create, deploy, and manage database report(s).

Development Tools

An SSRS report can be created using one of the following tools:

  • SSDT (SQL Server Data Tools)
  • Report Builder
  • 3rd party report authoring tools (including dbForge SQL Server Report Builder).

Choice of a Tool

Despite the fact that 3rd party report authoring tools offer out-of-the-box features with fancy GUI support, our aim in this article is not to choose the most convenient report building tool for beginners, but to choose a tool that helps you become familiar with the basics of report development.

Report Server

Once you have developed the report using one of the development tools, you need to deploy it to a server, called a report server, that is configured to match your requirements and hosts all the deployed reports in an organized, Windows-folder-like structure.

Report Manager

As the name implies, Report Manager helps you manage your deployed reports in the form of a web-based portal.

SSRS Report Development

Next, let’s discuss the pre-requisites and steps to quickly create an SSRS report.

Pre-requisites

SSRS reports development assumes the following things:

  1. You can write and run basic T-SQL scripts
  2. You have a basic understanding of SSDT (SQL Server Data Tools) or Report Builder
  3. You have a background in development or have exposure to T-SQL development.

Although it is not mandatory at this point, it is better if you have a readily available SSRS server configured to host your reports.

Report Development Steps

Please consider the following steps while building your SSRS reports when authoring reports using SQL Server Data Tools (SSDT):

  1. Create a new Report Server Project in SQL Server Data Tools (SSDT)
  2. Create a data source to be selected for your desired database
  3. Create a dataset which contains T-SQL to run behind the report
  4. Drag and drop fields from the dataset onto the report designer
  5. Test run the report
  6. Deploy the report (if you have configured a reporting server).

Setting up a Sample Database

First, set up a sample database which is going to be the data source for your new SSRS report.

You can set up a sample database called SQLDevBlogV4 by using the following script:

-- Create sample database (SQLDevBlogV4)
CREATE DATABASE SQLDevBlogV4;
GO


USE SQLDevBlogV4;

-- (1) Create Article table in the sample database
CREATE TABLE Article (
  ArticleId INT PRIMARY KEY IDENTITY (1, 1)
 ,Category	VARCHAR(50)
 ,Author VARCHAR(50)
 ,Title VARCHAR(150)
 ,Published DATETIME2
 ,Notes VARCHAR(400)  
)

GO

SET IDENTITY_INSERT [dbo].[Article] ON
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (1, N'Development', N'Atif', N'Introduction to T-SQL Programming ', N'2017-01-01 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (2, N'Testing', N'Peter', N'Database Unit Testing Fundamentals', N'2017-01-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (3, N'DLM', N'Sadaf', N'Database Lifecycle Management for beginners', N'2017-01-20 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (4, N'Development', N'Peter', N'Common Table Expressions (CTE)', N'2017-02-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (5, N'Testing', N'Sadaf', N'Manual Testing vs. Automated Testing', N'2017-03-20 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (6, N'Testing', N'Atif', N'Beyond Database Unit Testing', N'2017-11-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (7, N'Testing', N'Sadaf', N'Cross Database Unit Testing', N'2017-12-20 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (8, N'Development', N'Peter', N'SQLCMD - A Handy Utility for Developers', N'2018-01-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (9, N'Testing', N'Sadaf', N'Scripting and Testing Database for beginners ', N'2018-02-15 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (10, N'Development', N'Atif', N'Advanced Database Development Methods', N'2018-07-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (11, N'Testing', N'Sadaf', N'How to Write Unit Tests for your Database', N'2018-11-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (12, N'Development', N'Peter', N'Database Development using Modern Tools', N'2018-12-10 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (13, N'DLM', N'Atif', N'Designing, Developing and Deploying Database ', N'2019-01-01 00:00:00', NULL)
INSERT INTO [dbo].[Article] ([ArticleId], [Category], [Author], [Title], [Published], [Notes]) VALUES (14, N'DLM', N'Peter', N'How to Apply Database Lifecycle Management  ', N'2019-02-10 00:00:00', NULL)
SET IDENTITY_INSERT [dbo].[Article] OFF

Creating an SSRS Database Report

To create a new SSRS database report you need to create a new Report Server Project in SQL Server Data Tools (SSDT).

Creating a New Report Server Project

Open Visual Studio to create a new Report Server Project called Articles Report Project under Business Intelligence template provided you have installed SQL Server Data Tools (SSDT):

Adding a New Data Source

The first thing you need to do is to select the source of data for the report which is the sample database SQLDevBlogv4 in our case.

Right-click Shared Data Sources under Articles Report Project and Click Add New Data Source:

Connect to the required SQL instance, select sample database SQLDevBlogV4, and Click OK:

Name the data source DS_SQLDevBlogV4 and Click OK again:

Check the newly created data source:

Test Run Dataset Query

Open a new query window and run the following script against SQLDevBlogV4 to make sure that it pulls the desired data from the database:

-- Report dataset query to view all the articles
SELECT [ArticleId] ,
       [Category] ,
       [Author] ,
       [Title] ,
       [Published] ,
       [Notes]
FROM   dbo.Article;

Run the query to see the results:

Adding a New Dataset

Next, we are going to add a dataset in the form of T-SQL script to run behind the report.

Right-click Shared Datasets under project node and Click Add New Dataset:

Name the dataset DSet_SQLDevBlogV4 and add the query tested above in the input box:

Adding and Building a New Report

Right-click Reports under Articles Report Project node and Click Add New Report:

Skip the welcome screen by clicking Next button and then click Next again after making sure that your shared data source DS_SQLDevBlogV4 is already selected:

Write the same dataset query (which may seem to be an extra step) and Click Next:

Select (if not selected previously) Tabular report type and Click Next:

To design your report, drag the fields from the list onto Displayed fields by clicking Details, and then click Next:

Name the report ArticlesReport and Click Finish:

Go to Report Data and select Use a shared dataset under Datasets, then name the dataset Dset_Articles, Click Refresh Fields, and then OK:

Running the Report

Before running the report you need to do some formatting.

Click the headers and detail fields while holding down the CTRL key, and then use the toolbar to align the text and change the font and its size as follows:

Click Preview tab to run the report:

Congratulations! You have successfully created an SSRS report to show all the articles written by the authors.

Tasks to Do

Now that you are familiar with SSRS development fundamentals:

  1. Keeping in mind the example given in this article, please create an SSRS report based on the sample database named TechnicalTraining mentioned in this article.
  2. Try to create a report to view duplicates based on the sample database mentioned in this article.

The post SSRS Reports Development in Simple Terms appeared first on {coding}Sight.


In this article, I will demonstrate several ways to split a delimited string and insert it into a column of a table in SQL Server. You can do it using the following methods:

  1. Convert delimited string into XML, use XQuery to split the string, and save it into the table.
  2. Create a user-defined table-valued function to split the string and insert it into the table.
  3. Split the string using STRING_SPLIT function and insert the output into a table.

To demonstrate the above methods, let me prepare a demo setup. First, let us create a table named Employee on DemoDatabase. To do that, we need to execute the following query:

USE DEMODATABASE 
GO 

CREATE TABLE EMPLOYEE 
  ( 
     ID            INT IDENTITY (1, 1), 
     EMPLOYEE_NAME VARCHAR(MAX) 
  )

For this demo, we will insert the names of all employees in one row and the names of employees will be separated by a comma. To do that, we need to execute the following query:

INSERT INTO EMPLOYEE 
            (EMPLOYEE_NAME) 
VALUES      ('DULCE , MARA , PHILIP , KATHLEEN, NEREIDA , GASTON , ETTA , EARLEAN , VINCENZA')

Execute the following query to verify that data has been inserted into the column.

SELECT * 
FROM   EMPLOYEE

The following is the output:

As I mentioned above, we are going to split the delimited string and insert it into a table. So, we will create a table named Employee_Detail to store the delimited string split by any of the above methods.

To create a table, execute the following code:

USE DEMODATABASE 
GO 
CREATE TABLE EMPLOYEE_DETAIL 
  ( 
     ID      INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED, 
     EMPNAME VARCHAR(MAX) NOT NULL 
  )

Method 1: Use STRING_SPLIT function to split the delimited string

We will use the STRING_SPLIT function to split the string in a column and insert the output into a table. Before we do that, let me explain the STRING_SPLIT function itself.

What is STRING_SPLIT Function

STRING_SPLIT is a table-valued function introduced in SQL Server 2016. This function splits a string based on a specified separator character and returns the output as a table. It can be used in databases with a compatibility level equal to or higher than 130.

The STRING_SPLIT function accepts two parameters and returns a table with the separated values. The following is the syntax of the STRING_SPLIT function.

SELECT value FROM STRING_SPLIT (string, separator)

In the above syntax, separator is a single character used to split the input string.

The following is a simple example of the STRING_SPLIT function.

DECLARE @STRING VARCHAR(MAX) 
DECLARE @SPECIALCHARACTER CHAR(1) 
SET @STRING='NISARG,NIRALI,RAMESH,SURESH' 
SELECT * 
FROM   STRING_SPLIT (@STRING, ',')

The following is an output of the query:

As you can see in the above example, the name of the output column returned by STRING_SPLIT is “value.” We can filter the output returned by the function using a WHERE clause on the “value” column, and we can also sort it using an ORDER BY clause on the “value” column.

The following is an example.
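As a minimal hedged sketch reusing the literal string from the previous example, the split values can be filtered and sorted like this:

DECLARE @STRING VARCHAR(MAX) 
DECLARE @SPECIALCHARACTER CHAR(1) 
SET @STRING='NISARG,NIRALI,RAMESH,SURESH' 
SET @SPECIALCHARACTER=',' 

SELECT value 
FROM   STRING_SPLIT (@STRING, @SPECIALCHARACTER) 
WHERE  value <> 'RAMESH' 
ORDER  BY value

Only the rows that survive the WHERE clause are returned, ordered alphabetically by the value column.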

Now to insert a delimited string into a table, we will perform the following tasks:

  1. Create a variable named @EmployeeName to hold the delimited string from the Employee table. To do that, execute the following code:
    DECLARE @EMPLOYEENAME VARCHAR(MAX) 
    SET @EMPLOYEENAME =(SELECT EMPLOYEE_NAME 
                        FROM   EMPLOYEE)
  2. Create another variable called @Separator of the char data type. This variable holds the value of the separator, which will be used to split the strings into multiple values. To create the variable and assign the value to the separator, execute the following code:
    DECLARE @SEPARATOR CHAR(1) 
    SET @SEPARATOR=','
  3. Now use the STRING_SPLIT function to split the values of the employee_name column of the Employee table and insert them into the EMPLOYEE_DETAIL table. To do that, execute the following code:
    INSERT INTO EMPLOYEE_DETAIL 
                (EMPNAME) 
    SELECT * 
    FROM   STRING_SPLIT(@EMPLOYEENAME, @SEPARATOR)

The following is the entire script:

DECLARE @EMPLOYEENAME VARCHAR(MAX) 

SET @EMPLOYEENAME =(SELECT EMPLOYEE_NAME 
                    FROM   EMPLOYEE) 
DECLARE @SEPARATOR CHAR(1) 
SET @SEPARATOR=',' 
INSERT INTO EMPLOYEE_DETAIL 
            (EMPNAME) 
SELECT * 
FROM   STRING_SPLIT(@EMPLOYEENAME, @SEPARATOR)

Execute the above script. The script will insert nine rows into the table. Once you have executed it, make sure the data has been inserted into the EMPLOYEE_DETAIL table. To do that, execute the following query:

SELECT * 
FROM   EMPLOYEE_DETAIL

The following is the output:

Method 2: Split string using XML and insert the output in the table

When we want to split a delimited string, we can do it using table-valued functions. However, user-defined table-valued functions are resource-intensive and should be avoided where possible. As I mentioned, the STRING_SPLIT function is only available for databases with a compatibility level greater than or equal to 130 (a quick way to check the level is sketched below), so on older compatibility levels it is difficult to find a straightforward way to split a delimited string. We have created a simple and efficient solution for this task: we can split the string using XML.
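Before relying on STRING_SPLIT, it can help to confirm the database compatibility level; a hedged sketch follows (the ALTER statement is shown only as an illustration, since changing the level has wider side effects):

SELECT name, compatibility_level 
FROM   sys.databases 
WHERE  name = 'DemoDatabase';

-- Illustration only: raising the level makes STRING_SPLIT available
-- ALTER DATABASE DemoDatabase SET COMPATIBILITY_LEVEL = 130;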

So, in this section, I am going to explain the code of XML which can be used to insert the split delimited string in different rows of a column.

I have split the entire code into three steps.

Step 1: Convert the delimited string into the XML Format. To do that, execute the following code:

USE demodatabase 
go 

DECLARE @xml       AS XML, 
        @QueryData AS VARCHAR(max), 
        @delimiter AS VARCHAR(10) 

SET @QueryData=(SELECT employee_name 
                FROM   employee) 
SET @delimiter =',' 
SET @xml = Cast(( '<EMPNAME>' 
                  + Replace(@QueryData, @delimiter, '</EMPNAME><EMPNAME>') 
                  + '</EMPNAME>' ) AS XML) 

SELECT @XML

The following is the output:

To view the entire XML string, click the cell as shown in the image above. Once you click the cell, the XML should look like the following:

<EMPNAME>DULCE </EMPNAME>
<EMPNAME> MARA </EMPNAME>
<EMPNAME> PHILIP </EMPNAME>
<EMPNAME> KATHLEEN</EMPNAME>
<EMPNAME> NEREIDA </EMPNAME>
<EMPNAME> GASTON </EMPNAME>
<EMPNAME> ETTA </EMPNAME>
<EMPNAME> EARLEAN </EMPNAME>
<EMPNAME> VINCENZA</EMPNAME>

Step 2: Once the string is converted into XML, use XQuery to query the XML. To do that, execute the following code:

USE DEMODATABASE 
GO 

DECLARE @XML       AS XML, 
        @STR       AS VARCHAR(MAX), 
        @DELIMITER AS VARCHAR(10) 

SET @STR=(SELECT EMPLOYEE_NAME 
          FROM   EMPLOYEE) 
SET @DELIMITER =',' 
SET @XML = CAST(( '<EMPNAME>' 
                  + REPLACE(@STR, @DELIMITER, '</EMPNAME><EMPNAME>') 
                  + '</EMPNAME>' ) AS XML) 

SELECT N.value('.', 'VARCHAR(10)') AS VALUE 
FROM   @XML.nodes('EMPNAME') AS T(N)

The following is the output:

Step 3: Insert the output generated by the query executed above into the Employee_Detail table. To do that, execute the following code:

USE DEMODATABASE
GO
DECLARE @XML AS XML,@STR AS VARCHAR(MAX),@DELIMITER AS VARCHAR(10)
SET @STR=(SELECT EMPLOYEE_NAME FROM EMPLOYEE)
SET @DELIMITER =','
SET @XML = CAST(('<EMPNAME>'+REPLACE(@STR,@DELIMITER ,'</EMPNAME><EMPNAME>')+'</EMPNAME>') AS XML)
INSERT INTO EMPLOYEE_DETAIL (EMPNAME)
SELECT N.VALUE('.', 'VARCHAR(10)') AS VALUE FROM @XML.NODES('EMPNAME') AS T(N)
/*Output
 (9 rows affected)
 */

Once the data is inserted, execute the following query to verify that it has been inserted:

USE DEMODATABASE 
GO 
SELECT * 
FROM   EMPLOYEE_DETAIL

The following is the output.

Method 3: Split string using table-valued function and insert the output of the function in the table

This approach is traditional and is supported in all versions and editions of SQL Server. In this approach, we will create a user-defined table-valued function that uses a WHILE loop together with the CHARINDEX and SUBSTRING functions to split the string.

The following is the code to create a function:

CREATE FUNCTION [DBO].SPLIT_DELIMITED_STRING (@SQLQUERY  VARCHAR(MAX), 
                                              @DELIMITOR CHAR(1)) 
RETURNS @RESULT TABLE( 
  VALUE VARCHAR(MAX)) 
AS 
  BEGIN 
      DECLARE @DELIMITORPOSITION INT = CHARINDEX(@DELIMITOR, @SQLQUERY), 
              @VALUE             VARCHAR(MAX), 
              @STARTPOSITION     INT = 1 

      IF @DELIMITORPOSITION = 0 
        BEGIN 
            INSERT INTO @RESULT 
            VALUES     (@SQLQUERY) 

            RETURN 
        END 

      SET @SQLQUERY = @SQLQUERY + @DELIMITOR 

      WHILE @DELIMITORPOSITION > 0 
        BEGIN 
            SET @VALUE = SUBSTRING(@SQLQUERY, @STARTPOSITION, 
                         @DELIMITORPOSITION - @STARTPOSITION) 

            IF( @VALUE <> '' ) 
              INSERT INTO @RESULT 
              VALUES     (@VALUE) 

            SET @STARTPOSITION = @DELIMITORPOSITION + 1 
            SET @DELIMITORPOSITION = CHARINDEX(@DELIMITOR, @SQLQUERY, 
                                     @STARTPOSITION) 
        END 

      RETURN 
  END

Once the function is created, execute the following query to split the query and insert the output into the Employee_Detail table.

DECLARE @SQLQUERY NVARCHAR(MAX) 
SET @SQLQUERY=(SELECT EMPLOYEE_NAME 
               FROM   EMPLOYEE) 
INSERT INTO EMPLOYEE_DETAIL 
SELECT * 
FROM   SPLIT_DELIMITED_STRING(@SQLQUERY, ',')

Once the data is inserted into the table, verify that it has been inserted properly; a quick check is sketched below.
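As with the previous methods, verification is simply a select from the target table:

SELECT * 
FROM   EMPLOYEE_DETAIL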

Summary

In this article, I have covered:

  1. Different approaches to splitting a delimited string and inserting it into a table.
  2. A high-level summary of the STRING_SPLIT function.
  3. Splitting and inserting a delimited string using XML and XQuery.
  4. Splitting and inserting a delimited string using a user-defined table-valued function.

The post Several Ways to Insert Split Delimited Strings in a Column appeared first on {coding}Sight.
