/*****by Jiangong SUN******/
Update: 16/05/2013, 17/05/2013, 28/06/2013, 15/07/2013
PART III: Entity framework
2.3 Diagrams (shape and connector positions)
Structure of edmx file:
3) What is the difference between .SDF and .MDF files?
Answer:
An .sdf file is a SQL Server Compact Edition database. It is popular on Windows Phone for storing small amounts of data. It can also be used in web sites, but its performance is poor; even SQLite is better.
An .mdf file is the primary data file of the full SQL Server editions. It is used in big sites with a lot of data. It is more powerful, but it cannot be used on phones and must be installed on a server.
SQL Server Compact 4.0:
- Individual database file size is limited to 4 GB.
- Runs in-process (embedded) with an application.
- No FILESTREAM support.
- Limited to 256 concurrent connections.
SQL Server Express 2008 R2:
- Individual database file size is limited to 10 GB.
- Can be used as a standalone database, except in the user-instance scenario.
- FILESTREAM and CLR support.
- No limit on the number of simultaneous connections.
4) What is a stored procedure?
Answer:
A stored procedure is a named batch of T-SQL code stored in SQL Server, which compiles it and caches its execution plan the first time it runs.
Performance: reduce network traffic and enhance execution plan re-use
Stored procedures can be used to reduce network traffic. You only have to send the EXECUTE stored_proc_name statement over the wire instead of a whole T-SQL routine, which can be pretty extensive for complex operations.
Stored procedures allow you to enhance execution plan re-use, and thereby improve performance, by using remote procedure calls (RPCs) to process the stored procedure on the server. When you use a SqlCommand.CommandType of StoredProcedure, the stored procedure is executed via RPC. The way RPC marshals parameters and calls the procedure on the server side makes it easier for the engine to find the matching execution plan and simply plug in the updated parameter values.
Maintainability: provide a single point of maintenance
In a perfect world, your database schema would never change and your business rules would never get modified, but in the real world these things happen. That being the case, it may be easier for you if you can modify a stored procedure to include data from the new X, Y, and Z tables that have been added to support that new sales initiative, instead of changing that information somewhere in your application code.
Security:
In terms of regulating user access to information, they can provide access to specific data by allowing users permissions on the stored procedure, but not the underlying tables. You can think of stored procedures as similar to SQL Server views (if you are familiar with those), except the stored procedure accepts input from the user to dynamically change the data displayed.
The cached execution plan used to give stored procedures a performance advantage over queries. However, for the last couple of versions of SQL Server, execution plans are cached for all T-SQL batches, regardless of whether or not they are in a stored procedure. Therefore, performance based on this feature is no longer a selling point for stored procedures.
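As a rough illustration of the points above, the sketch below creates a parameterized procedure, calls it with a single EXECUTE statement, and grants a user rights on the procedure only. The table dbo.Orders, the procedure dbo.Orders_GetByCustomer and the user ReportingUser are hypothetical names invented for this example.

```sql
-- Hypothetical objects for illustration: dbo.Orders, dbo.Orders_GetByCustomer, ReportingUser.
CREATE TABLE dbo.Orders (
    OrderID     int           NOT NULL PRIMARY KEY,
    CustomerID  int           NOT NULL,
    OrderDate   datetime      NOT NULL,
    TotalAmount decimal(10,2) NOT NULL
);
GO

-- Single point of maintenance: the query lives in the database, not in the application.
CREATE PROCEDURE dbo.Orders_GetByCustomer
    @CustomerID int
AS
BEGIN
    SET NOCOUNT ON;
    SELECT OrderID, OrderDate, TotalAmount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
END;
GO

-- The client only sends this short statement over the wire.
EXECUTE dbo.Orders_GetByCustomer @CustomerID = 42;
GO

-- Security: the user can run the procedure without any direct permission on dbo.Orders.
CREATE USER ReportingUser WITHOUT LOGIN;
GRANT EXECUTE ON OBJECT::dbo.Orders_GetByCustomer TO ReportingUser;
GO
```

From ADO.NET, the same call would use a SqlCommand whose CommandType is StoredProcedure, as described above.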
reference:
http://msdn.microsoft.com/en-us/library/ms973918.aspx
5) How to create a trigger?
Answer:
A trigger can be declared as AFTER (FOR is a synonym of AFTER) or INSTEAD OF, and can react to INSERT, UPDATE and DELETE operations.
MSSQL does not support BEFORE triggers.
Syntax:
CREATE TRIGGER trigger_name
ON { table | view }
[ WITH ENCRYPTION ]
{
    { { FOR | AFTER | INSTEAD OF } { [ INSERT ] [ , ] [ UPDATE ] [ , ] [ DELETE ] }
        [ WITH APPEND ]
        [ NOT FOR REPLICATION ]
        AS
        [ { IF UPDATE ( column )
                [ { AND | OR } UPDATE ( column ) ]
                [ ...n ]
          | IF ( COLUMNS_UPDATED ( ) { bitwise_operator } updated_bitmask )
                { comparison_operator } column_bitmask [ ...n ]
        } ]
        sql_statement [ ...n ]
    }
}
Disadvantages (problems) of triggers:
- It is easy to view table relationships, constraints, indexes and stored procedures in a database, but triggers are difficult to view.
- Triggers execute invisibly to the client application. They are not visible and cannot be traced when debugging application code.
- It is hard to follow their logic, because they fire implicitly after (or instead of) the insert/update/delete that caused them.
- It is easy to forget about triggers, and if there is no documentation it will be difficult for new developers to discover that they exist.
- Triggers run every time the table they are attached to is modified, which adds overhead and makes the system run slower.
Reference:
http://blog.sqlauthority.com/2007/05/24/sql-server-disadvantages-problems-of-triggers/
Here is an example:
ALTER TRIGGER dbo.TriggerName
ON dbo.TableName
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- "inserted" may hold several rows, so join on it instead of reading one ID.
    UPDATE t
    SET CreateDate = GETDATE()
    FROM dbo.TableName AS t
    INNER JOIN inserted AS i ON i.ColumnID = t.ColumnID;
END;
reference:
http://stackoverflow.com/questions/11131540/sql-server-create-triggers-on-insert-and-update
6) What is a view?
Answer:
A view is a "virtual table" defined by a stored query.
A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table.
Simple views are expanded in place when a query uses them, so they do not directly contribute to performance improvements.
However, indexed views can dramatically improve performance.
Example:
CREATE VIEW viewName AS SELECT columns FROM tables WHERE conditions
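To make the template above more concrete, here is a small sketch of a view that joins two tables, applies an aggregate function, and is then queried like a single table. The tables dbo.Customers and dbo.CustomerOrders and the view dbo.v_CustomerTotals are hypothetical names used only for illustration.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE dbo.Customers (
    CustomerID int           NOT NULL PRIMARY KEY,
    Name       nvarchar(100) NOT NULL
);
CREATE TABLE dbo.CustomerOrders (
    OrderID     int           NOT NULL PRIMARY KEY,
    CustomerID  int           NOT NULL REFERENCES dbo.Customers (CustomerID),
    TotalAmount decimal(10,2) NOT NULL
);
GO

-- CREATE VIEW must be the first statement in its batch, hence the GO above.
CREATE VIEW dbo.v_CustomerTotals
AS
SELECT c.CustomerID, c.Name, SUM(o.TotalAmount) AS TotalSpent
FROM dbo.Customers AS c
INNER JOIN dbo.CustomerOrders AS o ON o.CustomerID = c.CustomerID
GROUP BY c.CustomerID, c.Name;
GO

-- The view is queried as if it were one table.
SELECT Name, TotalSpent
FROM dbo.v_CustomerTotals
WHERE TotalSpent > 1000;
GO
```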
reference:
http://stackoverflow.com/questions/439056/is-a-view-faster-than-a-simple-query
7) What is a unique index?
Answer:
Multicolumn unique indexes guarantee that each combination of values in the index key is unique. For example, if a unique index is created on a combination of LastName, FirstName, and MiddleName columns, no two rows in the table could have the same combination of values for these columns.
Example:
CREATE UNIQUE INDEX indexName ON table(columns) WITH (IGNORE_DUP_KEY = OFF);
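The following sketch applies the template to the LastName, FirstName, MiddleName example described above. The table dbo.People and the index name UX_People_FullName are hypothetical; the comments describe the behaviour expected with IGNORE_DUP_KEY = OFF.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE dbo.People (
    PersonID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    LastName   nvarchar(50) NOT NULL,
    FirstName  nvarchar(50) NOT NULL,
    MiddleName nvarchar(50) NOT NULL
);
GO

CREATE UNIQUE INDEX UX_People_FullName
ON dbo.People (LastName, FirstName, MiddleName)
WITH (IGNORE_DUP_KEY = OFF);
GO

-- Accepted: first occurrence of this combination.
INSERT INTO dbo.People (LastName, FirstName, MiddleName) VALUES ('Smith', 'John', 'A');
-- Accepted: the combination differs in MiddleName.
INSERT INTO dbo.People (LastName, FirstName, MiddleName) VALUES ('Smith', 'John', 'B');
-- Rejected with a duplicate-key error: the same combination already exists.
INSERT INTO dbo.People (LastName, FirstName, MiddleName) VALUES ('Smith', 'John', 'A');
GO
```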
8) What is an indexed view ?
Answer:
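An indexed view is a view with a unique clustered index, so its result set is stored physically instead of being recomputed on each query. Example:
CREATE TABLE wide_tbl(a int PRIMARY KEY, b int, ..., z int)
GO
CREATE VIEW v_abc WITH SCHEMABINDING AS
SELECT a, b, c FROM dbo.wide_tbl WHERE a BETWEEN 0 AND 1000
GO
CREATE UNIQUE CLUSTERED INDEX i_abc ON v_abc(a)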
9) What is the difference between a clustered index and a non-clustered index?
Answer:
A clustered index alters the way that the rows are stored. When you create a clustered index on a column (or a number of columns), SQL Server sorts the table’s rows by that column(s). It is like a dictionary, where all words are sorted in alphabetical order in the entire book. Since it alters the physical storage of the table, only one clustered index can be created per table. In the example below, the rows are sorted by computer_id, since the clustered index is created on the computer_id column.
CREATE CLUSTERED INDEX [IX_CLUSTERED_COMPUTER_ID] ON [dbo].[nics] ([computer_id] ASC)
A table without a clustered index is called a “heap table”. A heap table has no sorted data, so without a usable index SQL Server has to read the entire table in order to locate the data, in a process called a “table scan”. A non-clustered index is a separate structure that stores the indexed values together with pointers to the rows, so it leaves the physical order of the table unchanged:
CREATE NONCLUSTERED INDEX [IX_NONCLUSTERED_COMPUTER_ID] ON [dbo].[nics] ([computer_id] ASC)
Clustered index vs non-clustered index:
Clustered index
- Only one per table
- Faster to read than non-clustered, as data is physically stored in index order
Non-clustered index
- Can be used many times per table
- Quicker for insert and update operations than a clustered index
reference:
http://www.itbully.com/articles/sql-indexing-and-performance-part-2-clustered-and-non-clustered
http://stackoverflow.com/questions/91688/what-are-the-differences-between-a-clustered-and-a-non-clustered-index
10) What are the query optimizer and the query processor?
Answer:
At the core of the SQL Server Database Engine are two major components: the Storage Engine and the Query Processor, also called the Relational Engine. The Storage Engine is responsible for reading data between the disk and memory in a manner that optimizes concurrency while maintaining data integrity. The Query Processor, as the name suggests, accepts all queries submitted to SQL Server, devises a plan for their optimal execution, and then executes the plan and delivers the required results.
Query processor's job:
For each query SQL Server receives, the first job of the query processor is to devise a plan, as quickly as possible, which describes the best possible way to execute said query (or, at the very least, an efficient way). Its second job is to execute the query according to that plan.
Each of these tasks is delegated to a separate component within the query processor; the Query Optimizer devises the plan and then passes it along to the Execution Engine, which will actually execute the plan and get the results from the database.
Query processor's working steps:
SQL Statement ----> Parsing ----> Binding ----> Query Optimization ----> Query Execution ----> Query Results
Parsing makes sure that the T-SQL query has a valid syntax, and translates the SQL query into an initial tree representation: specifically, a tree of logical operators representing the high-level steps required to execute the query in question.
Binding is mostly concerned with name resolution. During the binding operation, SQL Server makes sure that all the object names do exist, and associates every table and column name on the parse tree with their corresponding object in the system catalog. The output of this second process is called an algebrized tree, which is then sent to the Query Optimizer.
The next step is the optimization process, which is basically the generation of candidate execution plans and the selection of the best of these plans according to their cost. As has already been mentioned, SQL Server uses a cost-based optimizer, and uses a cost estimation model to estimate the cost of each of the candidate plans.
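One way to look at the plan chosen by the optimizer is to ask SQL Server for the estimated plan instead of running the query. The sketch below uses SET SHOWPLAN_TEXT and a query against the sys.objects catalog view, chosen here only because it exists in every database; any other SELECT could be substituted.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- While SHOWPLAN_TEXT is ON, this query is parsed, bound and optimized,
-- but not executed: the estimated plan is returned instead of the rows.
SELECT name, create_date
FROM sys.objects
WHERE type = 'U'
ORDER BY name;
GO

SET SHOWPLAN_TEXT OFF;
GO
```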
reference:
https://www.simple-talk.com/sql/sql-training/the-sql-server-query-optimizer/
http://www.itbully.com/articles/sql-indexing-and-performance-part-3-queries-indexes-and-query-optimizer
11) How to optimize SQL Server's performance?
Answer:
1> Apply proper indexing to the table columns in the database
2> Move T-SQL code from the application into the database server
2.1 Move T-SQL into stored procedures, views, functions and triggers
2.2 Use T-SQL best practices
- Don't use "SELECT *" in a SQL query
- Avoid unnecessary columns in the SELECT list and unnecessary tables in join conditions
- Do not use the COUNT() aggregate in a subquery to do an existence check
- Try to avoid joining columns of different data types
- Try to avoid deadlocks
- Write T-SQL using a "set-based approach" rather than a "procedural approach" (avoid using cursors over large result sets)
- Try not to use COUNT(*) to obtain the record count in a table
- Try to avoid dynamic SQL
- Try to avoid the use of temporary tables
- Instead of LIKE search, use full-text search for searching textual data
- Try to use UNION to implement an "OR" operation
- Implement a lazy loading strategy for large objects
- Follow good practices in stored procedures: don't name them SP_XXX, because the prefix is used by system procedures and can cause conflicts. Try [App]_[Object]_[Action][Process] instead
- Follow good practices in triggers, and avoid triggers where possible because they are costly
- Use views to re-use complex T-SQL blocks, and to enable indexed views
3> Diagnose performance problems, and use SQL Profiler and the Performance Monitoring Tool effectively
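As an illustration of two items in the list (an explicit column list instead of SELECT *, and EXISTS instead of COUNT() for an existence check), here is a minimal sketch. The tables dbo.Authors and dbo.Books are hypothetical. Note that the optimizer may well produce the same plan for both forms; the second one simply states the intent directly.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE dbo.Authors (
    AuthorID int           NOT NULL PRIMARY KEY,
    Name     nvarchar(100) NOT NULL
);
CREATE TABLE dbo.Books (
    BookID   int           NOT NULL PRIMARY KEY,
    AuthorID int           NOT NULL,
    Title    nvarchar(200) NOT NULL
);
GO

-- Discouraged: counts every matching book just to learn whether one exists.
SELECT a.AuthorID, a.Name
FROM dbo.Authors AS a
WHERE (SELECT COUNT(*) FROM dbo.Books AS b WHERE b.AuthorID = a.AuthorID) > 0;

-- Preferred: EXISTS can stop at the first matching row.
SELECT a.AuthorID, a.Name
FROM dbo.Authors AS a
WHERE EXISTS (SELECT 1 FROM dbo.Books AS b WHERE b.AuthorID = a.AuthorID);
GO
```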
reference:
http://www.codeproject.com/Articles/34372/Top-10-steps-to-optimize-data-access-in-SQL-Server
12) Transaction?
Answer:
USE AdventureWorks;
GO
BEGIN TRANSACTION;
BEGIN TRY
    -- Generate a constraint violation error.
    DELETE FROM Production.Product
    WHERE ProductID = 980;
END TRY
BEGIN CATCH
    SELECT
        ERROR_NUMBER() AS ErrorNumber
        ,ERROR_SEVERITY() AS ErrorSeverity
        ,ERROR_STATE() AS ErrorState
        ,ERROR_PROCEDURE() AS ErrorProcedure
        ,ERROR_LINE() AS ErrorLine
        ,ERROR_MESSAGE() AS ErrorMessage;
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
END CATCH;
IF @@TRANCOUNT > 0
    COMMIT TRANSACTION;
GO
13) Full text search
Answer:
Full-Text Search in SQL Server lets users and applications run full-text queries against character-based data in SQL Server tables. Before you can run full-text queries on a table, the database administrator must create a full-text index on the table. The full-text index includes one or more character-based columns in the table. These columns can have any of the following data types: char, varchar, nchar, nvarchar, text, ntext, image, xml, or varbinary(max) and FILESTREAM. Each full-text index indexes one or more columns from the table, and each column can use a specific language.
Steps to create a full-text index:
- Create a Full-Text Catalog
- Create a Full-Text Index
- Populate the Index
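The three steps above can be sketched as follows. This assumes the Full-Text Search feature is installed on the instance; the table dbo.Articles, its key index PK_Articles and the catalog ArticlesCatalog are hypothetical names. Population is switched off at creation time only so that the third step is visible as a separate statement.

```sql
-- Hypothetical table; a full-text index needs a unique, single-column,
-- non-nullable key index, provided here by the named primary key.
CREATE TABLE dbo.Articles (
    ArticleID int           NOT NULL CONSTRAINT PK_Articles PRIMARY KEY,
    Title     nvarchar(200) NOT NULL,
    Body      nvarchar(max) NOT NULL
);
GO

-- Step 1: create a full-text catalog.
CREATE FULLTEXT CATALOG ArticlesCatalog;
GO

-- Step 2: create the full-text index on the character-based columns.
CREATE FULLTEXT INDEX ON dbo.Articles (Title, Body)
    KEY INDEX PK_Articles
    ON ArticlesCatalog
    WITH CHANGE_TRACKING OFF, NO POPULATION;
GO

-- Step 3: populate the index.
ALTER FULLTEXT INDEX ON dbo.Articles START FULL POPULATION;
GO
```

With the default change-tracking setting, SQL Server would instead start populating the index as soon as it is created.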
Full-text search:
FREETEXT() is a predicate used to search columns containing character-based data types. It matches the meaning of the words in the search condition rather than only their exact wording. When FREETEXT is used, the full-text query engine internally performs the following actions on the freetext_string, assigns each term a weight, and then finds the matches:
- Separates the string into individual words based on word boundaries (word-breaking).
- Generates inflectional forms of the words (stemming).
- Identifies a list of expansions or replacements for the terms based on matches in the thesaurus.
CONTAINS() is similar to FREETEXT, but it matches precise words or phrases. To search for several words, combine them with "AND" or "OR" (or enclose them in double quotes as a phrase); otherwise the query throws an error.
USE AdventureWorks2008
GO
SELECT BusinessEntityID, JobTitle
FROM HumanResources.Employee
WHERE FREETEXT(*, 'Marketing Assistant');
SELECT BusinessEntityID, JobTitle
FROM HumanResources.Employee
WHERE CONTAINS(JobTitle, 'Marketing OR Assistant');
SELECT BusinessEntityID, JobTitle
FROM HumanResources.Employee
WHERE CONTAINS(JobTitle, 'Marketing AND Assistant');
GO
reference:
http://blog.sqlauthority.com/2008/09/05/sql-server-creating-full-text-catalog-and-index/
14) SQL practices
1> There are 3 tables: Employee, WorkAt and Company. Get the employees who worked at at least 2 companies.
-- Assumes WorkAt has EmployeeID and CompanyID columns (CompanyID is an assumed name).
SELECT e.ID, e.FirstName, e.LastName, COUNT(DISTINCT wa.CompanyID) AS occurrence
FROM Employee e
INNER JOIN WorkAt wa ON e.ID = wa.EmployeeID
WHERE wa.StartDate BETWEEN '2000-01-01' AND '2013-05-17'
GROUP BY e.ID, e.FirstName, e.LastName
HAVING COUNT(DISTINCT wa.CompanyID) >= 2;
2> Get all differences between two tables
Rows that exist in only one of the two tables (matched on column C):
SELECT A.*, B.* FROM A FULL JOIN B ON (A.C = B.C) WHERE A.C IS NULL OR B.C IS NULL;
Rows that exist in A but not in B:
SELECT A.*, B.* FROM A LEFT JOIN B ON (A.C = B.C) WHERE B.C IS NULL;
15) Database ACID
Atomicity, Consistency, Isolation, Durability
Atomicity states that database modifications must follow an “all or nothing” rule. Each transaction is said to be “atomic.” If one part of the transaction fails, the entire transaction fails. It is critical that the database management system maintain the atomic nature of transactions in spite of any DBMS, operating system or hardware failure.
Consistency states that only valid data will be written to the database. If, for some reason, a transaction is executed that violates the database’s consistency rules, the entire transaction will be rolled back and the database will be restored to a state consistent with those rules. On the other hand, if a transaction successfully executes, it will take the database from one state that is consistent with the rules to another state that is also consistent with the rules.
Isolation requires that multiple transactions occurring at the same time not impact each other’s execution. For example, if Joe issues a transaction against a database at the same time that Mary issues a different transaction, both transactions should operate on the database in an isolated manner. The database should either perform Joe’s entire transaction before executing Mary’s or vice-versa. This prevents Joe’s transaction from reading intermediate data produced as a side effect of part of Mary’s transaction that will not eventually be committed to the database. Note that the isolation property does not ensure which transaction will execute first, merely that they will not interfere with each other.
Durability ensures that any transaction committed to the database will not be lost. Durability is ensured through the use of database backups and transaction logs that facilitate the restoration of committed transactions in spite of any subsequent software or hardware failures.
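The "all or nothing" rule and the consistency rule can be seen in a small sketch: a transfer between two accounts where the second update violates a CHECK constraint, so the first update is rolled back as well. The table dbo.Accounts and the constraint name are hypothetical.

```sql
-- Hypothetical table: the CHECK constraint is the consistency rule.
CREATE TABLE dbo.Accounts (
    AccountID int           NOT NULL PRIMARY KEY,
    Balance   decimal(12,2) NOT NULL
        CONSTRAINT CK_Accounts_Balance CHECK (Balance >= 0)
);
INSERT INTO dbo.Accounts (AccountID, Balance) VALUES (1, 100.00), (2, 50.00);
GO

BEGIN TRY
    BEGIN TRANSACTION;
    -- First half of the transfer succeeds on its own...
    UPDATE dbo.Accounts SET Balance = Balance + 500.00 WHERE AccountID = 2;
    -- ...but the second half would make account 1 negative and violates the CHECK.
    UPDATE dbo.Accounts SET Balance = Balance - 500.00 WHERE AccountID = 1;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Atomicity: undo the first update too.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
END CATCH;
GO

-- Both balances are unchanged: 100.00 and 50.00.
SELECT AccountID, Balance FROM dbo.Accounts;
GO
```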
16) MongoDB
MongoDB is a NoSQL, document-oriented database. It stores its data as JSON-like documents (BSON) made of field/value pairs.
17) Truncate vs. Delete from
TRUNCATE TABLE is a statement that quickly deletes all records in a table by deallocating the data pages used by the table. This reduces the resource overhead of logging the deletions, as well as the number of locks acquired: the individual row deletions are not logged, and the only record of the truncation in the transaction log is the page deallocation. The statement can still be rolled back inside an explicit transaction, but once it is committed the rows can only be recovered from a backup. You cannot specify a WHERE clause in a TRUNCATE TABLE statement; it is all or nothing. The advantage of using TRUNCATE TABLE is that in addition to removing all rows from the table it resets the IDENTITY back to the SEED, and the deallocated pages are returned to the system for use in other areas.
In addition, TRUNCATE TABLE statements cannot be used for tables published with transactional or merge replication.
TRUNCATE TABLE cannot be used when a foreign key references the table to be truncated. TRUNCATE statements do not fire triggers, so ON DELETE triggers would not run, which could result in inconsistent data. If all table rows need to be deleted and there is a foreign key referencing the table, you must drop the foreign key constraint and recreate it afterwards. If a TRUNCATE TABLE statement is issued against a table that has foreign key references, the statement fails with an error.
DELETE statements delete rows one at a time, logging each row in the transaction log, as well as maintaining log sequence number (LSN) information. Although this consumes more database resources and locks, these transactions can be rolled back if necessary. You can also specify a WHERE clause to narrow down the rows to be deleted. When you delete a large number of rows using a DELETE FROM statement, the table may hang on to the empty pages, requiring manual release using DBCC SHRINKDATABASE (db_name).
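The difference in IDENTITY behaviour and in WHERE support can be checked with a short sketch. The table dbo.DemoRows is a hypothetical name; the values in the comments follow from an IDENTITY(1,1) column.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE dbo.DemoRows (
    Id    int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Label varchar(20)       NOT NULL
);
INSERT INTO dbo.DemoRows (Label) VALUES ('a'), ('b'), ('c');   -- Ids 1, 2, 3
GO

-- DELETE accepts a WHERE clause and removes rows one at a time.
DELETE FROM dbo.DemoRows WHERE Label = 'a';
DELETE FROM dbo.DemoRows;

-- The identity counter is not reset by DELETE: this row gets Id 4.
INSERT INTO dbo.DemoRows (Label) VALUES ('d');
SELECT Id, Label FROM dbo.DemoRows;
GO

-- TRUNCATE takes no WHERE clause and resets the identity to its seed.
TRUNCATE TABLE dbo.DemoRows;

-- This row gets Id 1 again.
INSERT INTO dbo.DemoRows (Label) VALUES ('e');
SELECT Id, Label FROM dbo.DemoRows;
GO
```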
reference:
http://www.mssqltips.com/sqlservertip/1080/deleting-data-in-sql-server-with-truncate-vs-delete-commands
http://www.sqlservergeeks.com/blogs/RakeshMishra/sql-server-bi/76/sql-server-delete-vs-truncate
18) Linq to Entities vs. Linq to SQL
Linq-To-Sql - use this framework if you plan on editing a one-to-one relationship of your data in your presentation layer. Meaning you don't plan on combining data from more than one table in any one view or page.
Entity Framework - use this framework if you plan on combining data from more than one table in your view or page. To make this clearer, the above terms are specific to data that will be manipulated in your view or page, not just displayed. This is important to understand.
Performance comparison:
http://toomanylayers.blogspot.fr/2009/01/entity-framework-and-linq-to-sql.html
Read time (fastest to slowest): DataReader < DataSet < Linq to SQL < Linq to Entities