[T-SQL Tuesday] Data modeling: The trouble with prefixes

T-SQL TuesdayThis month, Mickey Stuewe (b|t) hosted T-SQL Tuesday, giving us a topic of “data modeling gone wrong“. This is a fantastic topic – there are just so many ways that data modeling can go wrong, even by experienced pros – that I wanted to chime in, even though I wasn’t able to get a post written by the deadline for that event.

Almost every company (and every programmer) has certain conventions that they use when developing programs. For the most part, they serve to streamline the process and make it easier for teams to maintain each others’ code. Hungarian notation is frequently a big part of that. While you may not know it by that name, we’ve all seen it: it’s the prefix, often in lowercase letters, that programmers attach to the beginning of the name of an object to denote its type or function. A couple of examples that I use frequently are “fn” to denote a function or “wk” to denote a variable that I use to store intermediate, working values. Useful, right?

Well, most of the time. Just like any good thing, Hungarian notation can be taken too far. Suppose you have a column in your database that stores US zip codes:

intZip int NOT NULL

Looks easy enough. A US zip code is 5 or 9 numeric digits. We can deal with the dash that a 9-digit zip code uses by masking it when it’s output. No problem.

… Except when your company decides to start doing business in Canada. Canada uses 6-character alphanumeric postal codes. The misnomer of “zip code” versus “postal code” is easy enough to ignore, but that prefix says that the value is an integer. Now your schema has a column like this:

intZip varchar(9) NOT NULL

and a query might resemble this:

FROM customers
WHERE intZip = 'M4B1B3';

Hmmm… The prefix isn’t really helpful anymore, is it? It’s actually counterproductive like this: your code states that the value is an integer (people apply meaning to the name, after all – that’s why you used it), but the datatype is a varchar. You either have to live with it or you have to refactor every piece of code that contains a zip code. Most of us will let the discrepancy go at first. It’s only one column, after all. But think about where this leads. One field might not matter so much, but what about the next time a type change is required? And the next? And the next? We change code all the time.

I worked with a system several years ago that was in this state. They had used Hungarian notation to denote the type of the column, not realizing the inevitable consequences. It was a mess. Strings had become integers; integers had become strings; strings had become dates; and so on. They had comments everywhere to remind themselves of what the actual type of the column was, and one of the first steps in every project was to confirm the types of the columns they needed to work with. Every change took them far longer than it would have if they hadn’t used the prefixes to denote the type.

Prefixes can be quite useful. They help make our code self-documenting, and they allow us to more easily distinguish similarly-named objects within a namespace. I don’t code without them. But use them the right way: only use a prefix if you’re certain that the characteristic the prefix describes will never change. And if when it does? Take the time to do the refactoring. Your coworkers will thank you later.

This is my post for week 3 of the November 2015 SQL New Blogger Challenge. Check out all of this week’s posts on Twitter (#SQLNewBlogger).

SQLNewBlogger, Week 3

I have a confession to make: I haven’t been writing much this week.

Last weekend, I didn’t feel well and didn’t do much of anything. This week, I’ve been buried in work and didn’t have time. Because of those, I missed Tuesday’s post, and I don’t have a writing prompt or a recap ready for this coming week’s SQL New Blogger Challenge post. Life has gotten in the way of writing.

This is what happened the last time I stopped. And the time before that. It can be intimidating to start writing again, especially in a public forum.

I had a plan to write a piece about a situation I recently encountered at a client site. It won’t be finished before Tuesday. But you know what? That’s OK. It’ll still be a good post another day.

For today, for this week, I’ll just write. And I hope that you’ll do the same.

SQL New Blogger Challenge: Week 1 recap

Thanks to everyone who participated in week 1 of the SQL New Blogger Challenge! Eight posts were written this week and linked on Twitter (#SQLNewBlogger), and I think I saw some that weren’t tweeted with our hashtag. That’s a great first week!

Andy Levy (b|t) once again generated a digest of the posts for this week. (Thanks, Andy!) Jump to his site (linked below) to check out the list, then visit the bloggers’ posts.

Now, go and write your posts for week 2! (And check out Steve Jones’ post from yesterday if you need more ideas.)

SQL New Blogger Challenge, November Edition, Week 1 Digest  [The Rest is Just Code]

SQL New Blogger Challenge: Week 2 ideas

Week 1 of the November 2015 SQL New Blogger Challenge is behind us, and I saw some great posts. Now, start thinking ahead to Week 2!

If you need an idea of what to write, it never hurts to follow the crowd. One of the traditions we have in the SQL Server community is T-SQL Tuesday, started several years ago by Adam Machanic. Each month, a different host throws out the topic on the first Tuesday of the month, and everyone participating has a week to write a post on that topic. Posts are due on the second Tuesday of the month.

This month, Mickey Stuewe is hosting, and the topic she chose is “Data Modeling Gone Wrong”. Pull out your stories about horrible data structures you’ve seen, or write something about how to avoid or mitigate problems with data structures. Anything goes, as long as it’s somehow related to the topic. Just make sure to follow the rules in Mickey’s post so people can find your post.

What? You don’t have any good ideas for this T-SQL Tuesday topic? That happens. Another idea is to review Steve Jones’s list of past T-SQL Tuesday topics. There are 71 other ideas on that list, so there’s sure to be something you can write about.

Don’t forget about our rules:

  1. Write – and publish – at least one post per week during the month of November.
  2. Weekly posts should be published by 11:59 PM each Tuesday (November 3, 10, 17, and 24). This week’s post will be due on November 10.
  3. Posts must be published on some sort of public forum. This can be your own website, Tumblr, WordPress.com, Blogger, LinkedIn – anything, as long as it’s public.
  4. Write on any topic you want.
  5. Tweet your posts after they’re published using the hashtag #SQLNewBlogger.

If you do decide to participate in T-SQL Tuesday this week, make sure to follow their rules, as well. (They’re very complementary.)

Go, write! I look forward to seeing what you come up with!


PASS Summit 2015 Highlights

The week of the PASS Summit is always one of the highlights of my year. Even though I keep connected with the community throughout the year, I love getting together with so many friends, both old and new, at this annual gathering of data professionals in one of my favorite cities, Seattle.

Continue reading PASS Summit 2015 Highlights