My blog here has moved to sql.kiwi. Please update your bookmarks and other links.

The next 24 Hours of PASS will celebrate the first 20 years of the international association of data professionals and data management experts. PASS stands for Professional Association for SQL Server; it was established on April 5th, 1999, and every year since then it has provided training on the entire Microsoft Data Platform!

For those who do not know the 24 Hours of PASS family of events: the event consists of a series of one-hour webinars. For this edition, the live sessions will start at 6:00 PM (UTC) on April 3rd, 2019 and will continue for 24 hours, during which international speakers will talk about past experiences, future visions, and how data will evolve over the next 20 years!

The main topics will be:

  • Database modernization and migration
  • Future-proofing your data architecture
  • Data in a world of enhanced security and privacy
  • The impact of AI on our management and usage of data
  • Fundamentals of data management
  • Fundamentals of data architecture
  • Fundamentals of analytics

The complete list of sessions is available here.

Thanks to the Sponsors, the event will be completely free!

Don't miss 24 hours of training on Microsoft Data Platform, register now using this link.

Data Quake. That's what it is. 
Dave Wells has just given this great definition, which clearly describes what has been happening in the data management world in recent years.
I am greatly enjoying Dave's session today at the Enterprise Data World summit and couldn't resist writing down a summary.
Everything we did in the last decade is now considered wrong. We used to believe that application logic runs faster and works better if it sits inside the database layer. Now that architecture is considered a poor choice. The same goes for data normalization and strong schemas. Some people even say that data warehouses are dead.
We need to rethink everything. The data schema used to be defined during the design phase. Now we define schema-on-read, after the data has been persisted. Good news: I have always believed, and Dave has just confirmed, that there is no such thing as schema-less data. Even though we no longer get to design the schema up front, for Big Data we need to understand the schema from the existing data and define a separate schema model for each use case. The same goes for data quality rules, which are no longer generic: we need to figure out the data quality rules for each use case. Even data governance is changing. We cannot govern the data itself anymore; it is moving outside our boundaries and outside our control. We can only govern what people do with the data.
In the modern data world we have more data sources and more types of data, many more ways to organize and store data, more uses for data, and more data consumers that require fast, on-demand data delivery.
The data management world is changing.
The age of Data Warehousing and BI has come to an end.
We are now in the age of Big Data and Data Lakes, but they are slowly going away as well.
And our future is approaching fast, bringing with it the concepts of Data Catalogs, Data Hubs, and Data Fabric.
I am still figuring out what all of this is and how it fits together.
What a great speaker, an awesome session, and tons of learning ahead of me.

Yours,
Maria



More and more companies are aiming to move away from managing their own servers and towards a cloud platform. Going serverless offers many benefits, such as lower administrative overhead and lower server costs. In a serverless architecture, developers work with event-driven functions that are managed by cloud services. Such an architecture is highly scalable and boosts developer productivity.

AWS Glue is an ETL service that runs on a fully managed Apache Spark environment. Glue ETL jobs can clean and enrich your data and load it into common database engines inside the AWS cloud (EC2 instances or the Relational Database Service), or write files to S3 storage in a great variety of formats, including Parquet.
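
To give a feel for what such a job looks like, here is a minimal PySpark sketch of a Glue ETL job that reads a dataset registered in the Glue Data Catalog, applies a column mapping, and writes it to a SQL Server RDS instance through a Glue JDBC connection. The catalog database, table, and connection names are hypothetical placeholders, not the ones from my posts.

# Minimal sketch of a Glue ETL job (PySpark). Database, table, and connection
# names are hypothetical placeholders.
import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ['JOB_NAME'])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args['JOB_NAME'], args)

# Read the source data that a Glue crawler registered in the Data Catalog
source = glue_context.create_dynamic_frame.from_catalog(
    database='my_catalog_db', table_name='raw_orders')

# Rename and cast columns before loading
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[('order_id', 'string', 'OrderId', 'int'),
              ('order_total', 'double', 'OrderTotal', 'double')])

# Write to the target database through a Glue JDBC connection
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=mapped,
    catalog_connection='sqlserver-rds-connection',
    connection_options={'dbtable': 'dbo.Orders', 'database': 'SalesDB'})

job.commit()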

I have recently published three blog posts on how to use the AWS Glue service to load data into SQL Server hosted on the AWS cloud platform.

 

1. Serverless ETL using AWS Glue for RDS databases

2. Join and Import JSON files from s3 to SQL Server RDS instance Part 1 and Part 2

 

Are you using AWS Glue? What do you think about serverless ETL?

Yours

Maria


The Database Migration Team of the SQL Server Product Group has created the following tools and services to facilitate migration between different versions of SQL Server, or from on-premises servers to Azure SQL Database in the cloud (and not only that)!

Azure Database Migration Service (Azure DMS)

Designed as a seamless, end-to-end solution for moving on-premises SQL Server databases to the cloud.

https://aka.ms/AzureDMS

Database Migration Guide

Review all your options for an upgrade from on-premises SQL Server versions to Azure SQL Database.

https://aka.ms/datamigration

Azure Migrate

Easily discover, assess, and migrate your on-premises virtual machines to Azure.

https://azure.microsoft.com/en-us/services/azure-migrate/

Azure Data Factory (ADF)

Create, schedule and manage your data integration at scale with our hybrid data integration (ETL) service.

https://aka.ms/adf

Data Migration Assistant (DMA)

Prepare your upgrade to a modern data platform by reviewing compatibility with your new SQL Server.

https://aka.ms/dma

SQL Server Migration Assistant (SSMA)

Automate database migrations to SQL Server from Microsoft Access, DB2, MySQL, Oracle, and SAP ASE.

https://aka.ms/get-ssma

Database Experimentation Assistant (DEA)

An A/B testing solution for changes in SQL Server environments, such as upgrades or new indexes.

https://aka.ms/dea-tool

SQL Server Data Tools (SSDT)

A modern development tool for building SQL Server relational databases, Azure SQL databases, Analysis Services data models, Integration Services packages, and Reporting Services reports.

https://docs.microsoft.com/en-us/sql/ssdt

Microsoft Data Migration Blog

The official team blog of the Database Migration Team.

https://aka.ms/dm_blog 

If you are considering a migration to Azure SQL Database, I suggest you read this white paper: Choosing your database migration path to Azure.

For more information, you can follow the Data Migration Team YouTube channel, where you will find videos and tutorials on the SQL Server data migration tools.

This is also the perfect opportunity to remind you that on July 9, 2019, SQL Server 2008 and SQL Server 2008 R2 will reach the end of extended support: Microsoft will no longer release updates for these two products, not even security updates. You can find all the details here. This is the time to migrate!

You can also provide your feedback on Data Migration Tools and Services by writing to @Data_Migrations (datamigration at microsoft dot com).


Another SQL Saturday has been scheduled in Pordenone (Italy): SQL Saturday #829. As last year, it will be an international event with international speakers and some sessions in English!

SQL Saturday Pordenone 2019 will take place on Saturday, February 23rd at the Consorzio Universitario di Pordenone, Via Prasecco 3/a, Pordenone (Italy).

The agenda for the day is divided into five tracks (one more than last year) that will deliver a total of 30 hours of free training on SQL Server, Power BI/Visualization, Cloud, Analytics, Enterprise Engine, and DevOps/Development! You can learn more about the topics you use every day, or about technologies you haven't used yet.

Thanks to our Sponsors, the event will be free of charge for you, but registration is mandatory.

If you are around Pordenone at the end of February, or if you want to come to Italy for a weekend of training on the Microsoft Data Platform with #sqlfamily friends and good food :), you are welcome!

Registration is available here and the official twitter hashtag is #sqlsat829.

See you in Pordenone!


Have you ever wondered how MySQL and PostgreSQL differ? Take a look at the mapping I have made for myself: https://www.mssqltips.com/sqlservertip/5745/compare-sql-server-mysql-and-postgresql-features/

I would be happy to see comments posted on this blog so I can add more items to the mapping.

Yours

Maria 



Many companies these days keep their data assets in multiple data stores. Many of the companies that I have worked at have used other database systems alongside SQL Server, such as PostgreSQL, Redis, Elasticsearch, or Couchbase. There are situations when an application that uses SQL Server as its main database needs to access data from another database system. Some data stores have ODBC/JDBC drivers, so you can easily add a linked server; others do not.

Want to learn how to access NoSQL platforms from SQL Server with Python? Read my article here: https://www.mssqltips.com/sqlservertip/5738/discover-how-sql-server-can-use-python-to-access-any-nosql-engine/
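
The general idea is to embed a Python script in sp_execute_external_script (available with SQL Server Machine Learning Services), call the NoSQL engine's API from Python, and return the result as a data frame that SQL Server exposes as a result set. Below is a rough sketch of the Python part only, using an Elasticsearch REST search call as the example; the host name and index are hypothetical placeholders, not the ones from the article.

# Sketch of the Python body that could be passed to sp_execute_external_script.
# The Elasticsearch host and index name are hypothetical placeholders.
import requests
import pandas as pd

# Query an Elasticsearch index through its REST search endpoint
resp = requests.get(
    'http://elastic-host:9200/orders/_search',
    json={'query': {'match_all': {}}, 'size': 100},
    timeout=30)
resp.raise_for_status()
hits = resp.json()['hits']['hits']

# Flatten the documents into rows; inside sp_execute_external_script,
# assigning the DataFrame to OutputDataSet returns it to SQL Server as a result set.
OutputDataSet = pd.DataFrame([h['_source'] for h in hits])
print(OutputDataSet.head())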

Happy NoSQLing

Maria 


Today I had to investigate a situation where my AWS Aurora PostgreSQL instance was at 100% CPU.

I started searching for the root cause at the level of the database statistics views. There is a view, pg_stat_activity, which shows information about the current activity of each process, such as the client host IP, the last transaction start time, and wait information. There were no long-running transactions or long waits, and unfortunately this view does not have any counters for CPU or memory usage per process. Boom.
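
As a rough sketch of that first check (assuming a psycopg2 connection to the instance; the connection parameters are placeholders), this lists the non-idle backends ordered by transaction age together with their wait events:

# Minimal sketch: list active PostgreSQL backends ordered by transaction age.
# Connection parameters are hypothetical placeholders.
import psycopg2

conn = psycopg2.connect(host='my-aurora-endpoint', dbname='postgres',
                        user='admin', password='***')
with conn.cursor() as cur:
    cur.execute("""
        SELECT pid, client_addr, state, wait_event_type, wait_event,
               now() - xact_start AS xact_age, query
        FROM pg_stat_activity
        WHERE state <> 'idle'
        ORDER BY xact_start;
    """)
    for row in cur.fetchall():
        print(row)
conn.close()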

Another way to track down performance issues is to use the pg_stat_statements extension to see execution statistics. The rows of this view are per query and provide information about how many times a query was executed, its execution time, the number of rows retrieved, and so on. Again, there are no counters related to memory or CPU usage.

Some people use the query below (source here), which is based on the query's total_time, assuming that a query that runs longer uses more CPU.

SELECT substring(query, 1, 50) AS query,
       round(total_time::numeric, 2) AS total_time,
       calls,
       rows,
       round(total_time::numeric / calls, 2) AS avg_time,
       round((100 * total_time / sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;

I am not sure this is true in all cases, but it might point us to the root cause.

I could have used the plperlu extension, which can show the percentage of CPU and memory used by a particular session, but it is not supported on AWS RDS.


PostgreSQL is a process-based system: it starts a new process for each database connection. This is why you can see a database connection's memory and CPU usage only through OS facilities.

If we are using RDS, we have no access to the OS level and cannot run top to take a look at all the processes.

Today I discovered Enhanced Monitoring for RDS instances, which makes it possible to monitor OS processes.

After enabling this monitoring, we can choose the OS process list view in the monitoring drop-down.

Using it, you can monitor all PostgreSQL processes and their resource consumption!
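
Enhanced Monitoring is normally switched on from the RDS console; as a rough sketch, it can also be enabled with boto3 (the instance identifier, region, and IAM monitoring role ARN below are hypothetical, and the role must be one that RDS can assume for Enhanced Monitoring):

# Sketch: enable RDS Enhanced Monitoring on an existing instance via boto3.
# Instance identifier, region, and IAM role ARN are hypothetical placeholders.
import boto3

rds = boto3.client('rds', region_name='eu-west-1')
rds.modify_db_instance(
    DBInstanceIdentifier='my-aurora-postgres-instance',
    MonitoringInterval=60,   # seconds between OS metric samples; 0 disables it
    MonitoringRoleArn='arn:aws:iam::123456789012:role/rds-monitoring-role',
    ApplyImmediately=True)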

Pid 6723 showed some sensitive information, so I had to blank it out. The screenshot above was taken after the CPU peak was over, which is why the numbers are low.

Now I can go back to pg_stat_activity, check which host and which application are using that specific connection, and see the executed queries and the wait stats:

select * from pg_stat_activity where pid = 6723

Unfortunately, pg_stat_activity does not show the active statement, only the top-level one. And there is no way to join pg_stat_activity with pg_stat_statements to match the pid of a connection with the query history that we can see in pg_stat_statements.

We are halfway through our problem. We now know which processes are consuming CPU, but we do not really know which queries they have executed.

However, since we have the hostname and the application behind the problematic connections, we can go to the application developers, review together the query patterns they execute, and try to understand why their connections are so CPU-intensive.

I would appreciate hearing your thoughts on this.

Yours

Maria


From the Redgate website you can download (for free) the ebook SQL Server Execution Plans, written by Grant Fritchey and technically reviewed by Hugo Kornelis.

This ebook cannot be missing from your digital library!

Enjoy! 
