Saturday, June 3, 2017

When Excel Is Your Hammer

Last week, a counterpart in a neighbouring school district sent me the picture you see at the right.

She'd been talking with a principal about their data, and he'd been sketching what he thought needed to be represented. The administrator wants to compare student performance on the reading strand of the state test with their performance on the writing strand of the same test. Although his drawing shows four levels of each, there are really only three reported: below, at/near, and above. Her question for me: Could this be done in Excel?

Um, sure...why not? We're just talking a scatterplot here. Replace the text of the labels with numbers (1, 2, 3) for reading and writing, then just get all up in that scatter chart's business. I sent my friend some basic ideas about how I would approach it, and said I would pull some sample data to model things.

I grabbed some information on 50 of my own students as a start. I replaced the levels reported for each student with numbers (above = 3, at/near = 2, below = 1). Then, I selected the columns with the numerical data and inserted a scatter chart. Easy-peasy, right?

Except, I forgot something important. Many students have the same scores. For example, on the left, we can see that students 3, 6, and 9 all scored in the "at/near" (2) range in both reading and writing. When we plot their points on the chart, they overlap and appear as a single point instead of three students. This was no good. Part of what the principal wanted to be able to see were hot spots---areas of the chart where the school would need to focus for next year. He also wanted to get information about individual students.

I should probably stop my story for a moment here and say that I do not think this---or any other---chart is necessary for the goals the principal stated. If you really just need a list of kids, put a filter on the columns and sort to find the students who are "below" in reading and writing. I suppose that if you really needed to get fancy, you could use a pivot table to summarize things. If you had to have a chart that gave you an idea of the size of the problem, a bubble chart might do. Or, possibly a heat map. I called my friend back and we talked about this. This issue is always the biggest challenge with translating someone's vision into practice. It also gets back to the question I am best known for in my district: What is the problem you are trying to solve? While my colleague agreed with me about the lack of general usefulness of the chart the principal had sketched, she still wanted to produce it. Maybe after looking at it, he'd have a better idea of what he was really after.
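For anyone who wants the pivot-table idea outside of Excel, here is a minimal sketch in Python. The student list is made up for illustration (it is not the sample data from my file), and the level codes follow the same 1/2/3 scheme described earlier.

```python
from collections import Counter

# Hypothetical sample: one (reading, writing) pair per student,
# coded 1 = below, 2 = at/near, 3 = above.
students = [(2, 2), (1, 1), (2, 2), (3, 2), (1, 1), (2, 2), (1, 2)]

# Count how many students share each reading/writing combination.
counts = Counter(students)

# Print a 3x3 summary: writing level down the side (3 at the top),
# reading level across (1 to 3), like a small pivot table.
for writing in (3, 2, 1):
    row = [counts[(reading, writing)] for reading in (1, 2, 3)]
    print(f"writing {writing}: {row}")

# The "just give me a list of kids" version: student numbers
# (1-based position in the list) who are below in both strands.
both_below = [i + 1 for i, pair in enumerate(students) if pair == (1, 1)]
print("below in both:", both_below)
```

The counts per cell are also what a bubble chart or heat map would need for sizing or shading.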

So, back to the drawing board for me. I know...I could have left her in the lurch ("Good luck!"), but I appreciate a challenge. Excel was not going to win this one, dammit.

It was then that I decided to jitter the data points. Jittering introduces a tiny bit of randomness to the values so that the points don't overlap so much.

I added two columns (C, E) for the jittered points. You can now see that students 3, 6, and 9 have values that are just a tiny bit different from one another.

The formula in C2 is =B2+(RAND()-0.5)/5. The purpose is to combine the original value with a randomly generated number. It uses the RAND function to create the random values. In this case, I didn't want a lot of noise added to the data, just enough to separate things on the chart. Once in place, the formula is copied down through the rest of Column C, and then applied in Column E to the writing data.
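The same jitter can be sketched in Python, assuming the levels are coded as above. The function name, the seed, and the three sample students are my own placeholders, not part of the workbook.

```python
import random

# Level labels mapped to numbers, as described in the post.
LEVELS = {"below": 1, "at/near": 2, "above": 3}

def jitter(value, spread=0.2, rng=random):
    """Add uniform noise in [-spread/2, +spread/2).

    With spread=0.2 this mirrors =B2+(RAND()-0.5)/5, since
    (RAND()-0.5)/5 is the same as (RAND()-0.5)*0.2.
    """
    return value + (rng.random() - 0.5) * spread

# Hypothetical seed so the jittered values stay the same between runs.
rng = random.Random(2017)

# Three students who all scored "at/near" in reading.
reading = ["at/near", "at/near", "at/near"]
jittered = [jitter(LEVELS[level], rng=rng) for level in reading]
print(jittered)
```

One difference worth knowing: Excel's RAND recalculates whenever the sheet does, so the points shift a little each time. Seeding the generator here keeps them put.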

This is what the jittered plots look like, with a minor adjustment made to the axes. Now that I have a few values less than 1 and greater than 3, I needed to ensure those showed up on the chart. The new axis ranges are .5 - 3.5. After making that change, I deleted the labels and used text boxes to add back the original wording. For the data points, I assigned some transparency to the fill so we could better see the overlaps.

We now have a chart that reflects the principal's request. I sent off the file with the sample data and chart to my friend and hoped that it might spur some discussion with the administrator about whether or not this was the right tool for the job he had in mind. Just because we can use Excel doesn't mean we should.

https://twitter.com/fleurdevie/status/2810755338
I don't mean to discount the principal's intentions. Yes, a simple list of students would get you to the same place (and a lot more quickly). But it doesn't necessarily have the same impact as a visual. It may well be that the type of scatter plot shown above engenders some productive conversation with his staff. He has a story in mind that he needs to tell. In that case, maybe Excel is the right hammer for this particular nail.

Sunday, April 2, 2017

Backwards Bar Charts

Recently, someone shared a visualization from Periscopic about the Trump Emoto-coaster. While the subject matter itself was not of particular interest to me, I did like the presentation of it.

Strap yourselves in. Your hands must be this small to ride this ride.
The line chart at the top made me think about the rises and falls within a school year. March seems like an especially cruel month, with teachers' tempers growing short. (Just ask me about how I ended up in a conversation with a five-year-old about why we need to wear pants at school.) How do attendance and discipline intertwine? And, when I looked at the horizontal bar cum sparkline plots shown above, it also made me wonder what we would see if we plotted individual classrooms over time. Maybe something like this:

Let's say there are four teachers at a particular grade level in a school. If we looked at the number of student absences and office referrals from the beginning of the year to the end of the year...what might we see?

If I was a principal, I might use something like this to either look for "hot spots" in my school that I might not know about...or monitor how well my school improvement initiatives are being implemented at the classroom level...or even to show staff for input. If I was a teacher, this might give me a general way to compare outcomes in my classroom. It might also piss me off (This just shows you that I have ALL of the bad kids!).

My challenge was how to build this. At its most basic level, this is a floating bar chart. And Ann Emery has a great tutorial for doing just that in Excel. But I didn't take that particular route this time because of how I need these charts laid out. You see, absences for any given classroom total no more than 70 in a month...but referrals are no more than 13. Excel isn't going to let me push the edge of the chart off the left-hand side of the worksheet if I keep the x-axis the same on both sides, meaning I would end up with a ton of blank space. I suppose I could put attendance on the left and discipline on the right, but hey, what's Excel without some challenges?

So, how do you build a backwards bar chart?

Create your horizontal bar chart the usual way, then fuss a little bit with the axis settings.
Once you do this and remove the gridlines and axes themselves, you'll be able to position this bar smack-dab against the other one. You know it's worth it...you can work it. Just put that chart down, flip it, and reverse it.

Holla!
Another thing to know about this chart is the addition of the line down the middle. Since I deleted the gridlines and axes, I need some sort of visual between the bars. So, a simple line shape in grey 1.5 pt is all that was added.
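If it helps to see the layout logic without Excel, here is a rough text-only sketch in Python. It is not how I built the charts; it just shows the two ideas at work: the left-hand bars grow from right to left, and each side is scaled to its own maximum so the small referral counts aren't squashed by the larger absence counts. The function name and the monthly numbers are hypothetical.

```python
def mirrored_rows(labels, left_values, right_values, width=20):
    """Return text rows with a centre line between two sets of bars.

    Left bars are right-aligned, so they grow right to left (the
    "backwards" bars). Right bars are left-aligned. Each side is
    scaled to its own maximum value.
    """
    left_max = max(left_values) or 1    # avoid dividing by zero
    right_max = max(right_values) or 1
    rows = []
    for label, left, right in zip(labels, left_values, right_values):
        left_bar = "#" * round(left / left_max * width)
        right_bar = "#" * round(right / right_max * width)
        # "|" stands in for the grey line down the middle.
        rows.append(f"{label:>4} {left_bar:>{width}}|{right_bar:<{width}}")
    return rows

# Hypothetical monthly totals for one classroom.
months = ["Sep", "Oct", "Nov"]
absences = [35, 70, 14]    # left side, reversed
referrals = [13, 0, 6]     # right side

rows = mirrored_rows(months, absences, referrals)
print("\n".join(rows))
```

Because the two sides use different scales, bar lengths are only comparable within a side, which fits the goal here: patterns and comparisons rather than exact numbers.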

In terms of labels, I'm going to leave them off. If you understand how one is laid out, then you can understand a whole school's worth. The numbers themselves aren't the big idea with this visual. It's the patterns and comparisons we're after. When we've identified those, we're ready to ask some deeper questions and dig into the numbers in a different way. These charts are the starting point for conversations...not the end...even if that seems a little backwards.

Tuesday, March 28, 2017

Make the connection: Student growth to teacher action

I have had the privilege of presenting at the ASCD annual conference over several years. I've been an ASCD member nearly my whole career. It's an organization that, as the rebranded conference name suggests, empowers educators in all roles to support students.

This year, I am presenting on the qualitative side of data. My session description is "If 'not everything that counts can be counted,' as Albert Einstein suggested, then how do we measure and represent student growth beyond test scores and grades? In this interactive session, you will learn strategies that capture student learning in multiple ways, as well as how to communicate feedback about the whole child using data visualization. Join the conversation about how to apply digital and analog tools to tell your students' stories and report the full spectrum of student learning."

The challenge of doing a presentation like this is that I have to submit the description more than six months before the conference. Whatever it is that I had in mind to talk about in August was long gone before I received notification that the proposal was accepted...let alone when I sat down to build the content. I am influenced, too, by all the things I have learned in the interim.

The basic story arc did finally emerge. I'll start first by talking a bit about why data visualization can be a powerful tool. This is my usual lead-in, and I think it helps to provide a few easy to grasp examples before launching into new territory. The next hook is to talk about achievement data. Now, this particular piece does not explicitly fit the session description, but my goal is to move from the larger scope of the purpose of data viz to what we typically see in education, and finally into non-traditional ways to represent education data...and perhaps even a little further than that.

I heard a presenter this morning say that "schools embrace business ideas as they are fading." In other words, what was hot in the private sector 5 or 10 years ago becomes what schools are talking about now. I have seen this happen a lot over the course of my career. And what worries me most is that decisions about data privacy and access are being made now that will affect schools in ways they haven't even anticipated yet. I am not going to claim that I can change the world with my presentation and suddenly schools will make these conversations a priority...but it's a start.

My call to action for them is around being in control of creating their own narratives using data and to think about what they want to represent, not necessarily what they are told to represent. All too often, the public view of school data is just annual test scores. But children are so much more than the sum of their test scores. They deserve a more robust approach to sharing their stories (and to be involved in that process, as well).

I have an ancient (by web standards) wiki where I have placed materials for this session. Someday, I'll move everything over to GitHub...but for now, it's a reminder of the journey I've taken to this point and perhaps a place to shape the ideas ahead.

Sunday, March 26, 2017

ASCD 2017: Data Tools

This weekend, I am at ASCD's annual conference, now referred to as Empower. This is not the fourth year that I've attended (and presented...but more on that in another post), but it is the fourth time I've wandered through the exhibit hall with an eye toward what various companies are promoting to schools. The versions from 2013, 2014, and 2015 are available, if you'd like a trip down memory lane. In fact, that might not be a bad idea, because (Spoiler alert!) there was nothing particularly new or outstanding.

Data management and reports
There are no stand-alone systems here this year. Lots of vendors who focus on assessment and grading, however, do have displays and reports for managing student information. I asked the same question that I have asked for the last five years: Who builds your displays and reports? And yes, of course the developers make the magic happen behind the scenes, but there is still the same disturbing number of companies out there where that is the only answer. Or, there may be something along the lines of "we got feedback from teachers." This is all well and good---I support user involvement. I also know that developers and teachers are not data designers. People are spending a lot of time on these products, but they don't care enough to make the effort to ensure that effective communication with the data actually happens.

In one particular exchange, the project creator told me that all her charts were a result of her research. I have no doubt that the project is spawned from many years of hard work with teachers...but I know that she does no research with data design or communication. In fact, she was a little upset that anyone would ask about how she came to make her choices. I won't link to it here, but it's a new partnership with ASCD that you can look for, if you're interested.

Student assessment
There were some different displays this year for various flavours of student assessment. Scantron is making a big show this year. I chatted with them a bit and they have a few sessions this weekend. Perhaps I just hadn't noticed before that they are more than the "bubble sheet" company, but it looks like they're diversifying and growing into student assessment. I can't speak to the quality of this new content, but in an age of apps, Google tools, and other options, it seems wise to be more than a company involved with scoring assessments.

Pacific Metrics is a content aggregator for various assessment banks, like the ACT or district-developed items. It produces no reports---it just integrates with existing school information systems. I think this is a desirable option for a lot of districts, especially smaller ones who may not have the resources to develop their own content. If someone else has valid and reliable items for you...and those can be automagically scored and then imported into your gradebook...it could be helpful. It's not a replacement for professional judgment---teachers would still need to ensure that the assessment matches the content.

The product I liked best came from Exemplars K - 12. This company provides not only rich tasks for the classroom, but also scoring tools and anchor papers. This last piece is incredibly valuable. These exemplars are representative of actual student work and can show teachers how to implement the rubrics. While I will always advocate for teachers to come together to develop tasks, score student work, and engage in conversations about student learning as a result, I can't deny that these banks would also support good teaching. I think this is especially important for rural or small districts, or schools with a high rate of turnover so that new teachers have a consistent framework in place.

Have you seen something this year for student data or assessment that you like?

Monday, March 13, 2017

Three: Student Information

In theory, I was going to publish one data story a month this year. In reality, it's March (the school year started in September) and I'm only on my third one. I am way behind on my goal. But I am learning to make my peace with that sad state of affairs. This project is going to run into the next school year...and I'm okay with that.

So let's talk about number three. It's a magic number, is it not?


This month(ish), we're looking at our various student information systems. Each collection of squares represents one system. On the left is Skyward, our district system...and data flows in various directions from there to other systems, including TIDE on the right that we use for state assessments.

Each group has layers that are colour-coded by the type of accounts/users it houses. Green is for students, yellow is for teachers, pink is for school administrators, and little red pins are for district administrators. Only one system has blue, representing parents. The sizes of the squares tell you something about the number of people represented by the data set. Each square inch is 50 people. The green squares are largest and district administrators the smallest. All of the systems, except for one, include students.
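The arithmetic behind the square sizes is simple, and a short Python sketch makes it concrete. The only number taken from the board is the scale (50 people per square inch). The account counts below are invented for illustration.

```python
import math

PEOPLE_PER_SQUARE_INCH = 50  # scale used on the bulletin board

def square_side_inches(people):
    """Side length of a square whose AREA represents `people`."""
    area = people / PEOPLE_PER_SQUARE_INCH
    return math.sqrt(area)

# Hypothetical account counts for one system.
accounts = {"students": 5000, "teachers": 200, "school admins": 50}

for group, people in accounts.items():
    side = square_side_inches(people)
    print(f"{group}: {side:.2f} in. per side")
```

Sizing by area rather than by side length matters: doubling the side of a square makes it look four times as big, which would exaggerate the larger groups.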

Two of the systems that I chose to represent (SWIS and Google Apps) are connected to our system with a broken line, because there is not a direct data connection. Instead, a system of imports and exports is used.

I also built some charts to show a bit about how families are accessing Skyward. Generally speaking, they log in about twice a month, during the work week, and in the morning.

I don't have any specific data on how many users are represented by our state data warehouse (CEDARS) or Google Apps. I can only tell you how many data points we transfer in a given week (~250K to CEDARS) or documents we share online (over 600K in Google).

Bottom line: There's a lot of data flowing around.

Questions to Ponder
I selected the topic of information systems because they really are invisible...yet their impact is very real. Me? I'm represented by those nearly invisible red pins in the center of almost every square. I can see all these data, but there are a lot of people who can't due to their permissions or system access. This data story project this year is about sharing data beyond the usual suspects like attendance or achievement. Information systems are a good place to shine some light.

In the end, this is really a story about power and privacy. You'd think that the biggest group in these systems (students) would have the greatest power to use these data, but the fact is simply that they have none at all. Some systems look large, like TIDE, and yet a student or teacher might log in only once or twice a year. Others, like Homeroom, look insignificant and yet they are our most powerful tools for reviewing student information. Looks are deceiving.

Bonus Round
While the offline bulletin board is intended to be a conversation piece, as well as a way to reach audiences that might not have an Internet connection, I always put together an online component, too. This time around, I share a video about the historical origins of personal privacy and provide a way for you to look at how the clicks you and others contribute to our web site add up.

Peekaboo...I see you.

Friday, November 4, 2016

Two: A Month in the Board Room

When I first shared this month's topic for a data story, I received a lot of quizzical looks in return. A month in the board room? Whaaaa? Why would anyone care about that? But I didn't know why anyone---including me---would care. I wasn't sure what I'd see. However, that is one of my larger goals with this project. I want to find out what happens when we pay attention to data that are typically ignored.

A Month in the Board Room is the second in my "10 for 10" challenge I have set for myself this year. It is my goal to tell ten new data stories in ten months. Truth be told, I'm running a bit behind. There is a huge learning curve with these. What I have in my head is never quite what appears on the board and on the web. I try to remind myself that perfect is the enemy of the good. It is more important to finish and post something rather than wait until everything is just right. If I go that route, I might manage to put up only one of these. I am learning more each month. Maybe by the time I get to number ten---whatever and whenever that will be---I'll be a well-oiled machine in terms of getting things posted.

Working with the Data
On the right, you'll see a mini-version of the display that I built from the data. After selecting a month (February), the data were exported from an Outlook calendar into Excel. Each meeting was coded into one of nine categories after reviewing the full list: teacher/principal evaluation, curriculum, assessment, operations, parent meetings, private groups, special services, district office, and administration.

A basic layout, with time of day across the top and days of the month down the left side, was set up with blocks of time for each meeting in the calendar.
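For the curious, here is one way that layout step might look in Python, assuming each exported meeting has a start and end time. The meeting names, dates, and the to_block helper are hypothetical. Only the category names come from my coding scheme.

```python
from datetime import datetime

# Hypothetical rows, shaped like a calendar export:
# (subject, start, end, category)
meetings = [
    ("Science adoption", "2016-02-03 08:00", "2016-02-03 11:30", "curriculum"),
    ("Parent group", "2016-02-03 18:00", "2016-02-03 19:30", "parent meetings"),
]

def to_block(start, end, fmt="%Y-%m-%d %H:%M"):
    """Turn a start/end pair into (day of month, start hour, length in hours).

    The day picks the row, the start hour sets how far across the
    bar begins, and the length sets how wide the bar is.
    """
    s = datetime.strptime(start, fmt)
    e = datetime.strptime(end, fmt)
    start_hour = s.hour + s.minute / 60
    hours = (e - s).total_seconds() / 3600
    return s.day, start_hour, hours

blocks = [(category, *to_block(start, end))
          for _, start, end, category in meetings]
for block in blocks:
    print(block)
```

Each tuple is everything a floating bar needs: which row it sits in, where it starts, how long it runs, and which colour (category) to use.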

I went through several iterations of color coding. There were several combinations of colors I tried that I actually liked better than the one shown at the right. However, while they looked lovely together in some other types of charts, they looked terrible as big bars. This version seemed not only pleasing on the eyes, but also allowed me to divide the information into things that are initiated from within our department (green) and things that are generated elsewhere (brown).

I knew that I had hit on the right display when I showed it to my boss. He appreciates and supports the work I do, but really isn't all that into data. But when he saw this, he actually started engaging and noticing things. After a minute or two, I pointed out to him that he was talking about data...at which point, he smiled and left. Gotcha.

Offline
Just like last month, there is an offline component to the data display---a bulletin board outside my office. The graphic you see above was sent to Costco in two parts and printed as two 16 x 20" photos. A swatch of each color was also sent to be printed as 4 x 6" photos.


This photo is of the board in progress. We have some explanatory text on the left...the poster in the middle (there's more coming) and the "legend" on the right. Each of the swatches is on a card that viewers can lift to see more information about a category, including its name. We decided not to put the names of the categories on the front of the card to encourage viewers to spend some time first trying to interpret or make sense of what they see in the larger poster.

The additions to this display include a second large poster attached over the one you see. The top one has some of the bars turned into options that open and reveal details of the meetings represented. We also have data about occupancy and other meeting spaces attached to the board.

Online
Our companion web page uses an embedded PowerPoint to enable users to see details of every meeting, including links to additional documents and sites. Users can also download a larger data set to explore on their own.

It is my goal to prompt conversation and reflection about data. I am encouraged by the comments I have received about the project and some of the discussions I've had. For example, a couple of people suggested that I not represent weekends on this month's data display. Those days do contribute to a lot of blank space, but it is also an opportunity to think about the importance of what we choose to represent...and what it means when we don't represent something. Sure, our board room is rarely used on weekends, but it just makes me wonder about what activities we might see if we represented what people did when they weren't at work.

Bonus Round
So, what did we learn by representing the data? Maybe no large insights, but it does make it easier to see how different groups access the space. Parent groups are only there in the evenings. Administrators meet either before or after the school day. Special Services only uses the room when no subs would be required. Work around curriculum (science, math, career and technical education, and so on) is the heaviest user. In terms of what we don't see, students are rarely in that space. This is not a surprise. Neither is the lack of meetings around construction and infrastructure---we have a ton of those due to ongoing bond work around the district, but those involved don't need such a large meeting space. If I'd plotted things over the course of the year, I'm sure there would be different patterns revealed, but someone else can take that on.

I am already planning the next one, even as the paint is barely dry on this. At some point in the future, I will have a high school stats class help design a story...and there are plenty of other ideas cooking in the background. I hope that we'll push our data discussions far beyond simple red, yellow, and green-filled cells representing assessment results and into more well-rounded conversations about what we value...and what we do when we don't see those values on display.

Sunday, October 2, 2016

One: If the District Were 100 Students

After attending Tapestry earlier this year, I decided that I wanted to showcase some different data stories. In my day job, I mostly work with student data---test scores, demographics, attendance, discipline, and so on. All good stuff in its own way. But there are lots of things that we collect and don't share, either because of student privacy concerns or just lack of trying.

It is my goal this year to tell ten new data stories in ten months. And while I'm a little later in getting the first one up and running, it's happening. Every story will have an online component with links to programs, data sets, or interactive views. Each one will also have an offline component. I've commandeered one of the bulletin boards in our district office. My goal with that piece is to make "touchable" data, and data displays that can be viewed and experienced regardless of Internet access. Out of all of this, I hope that we bring to light some new understanding to different audiences and create some interest in increasing the visibility of some of our underrepresented students.

For (late) September, our focus is "If the district were 100 students." Maybe not the most original topic; however, it has given us a safe place to figure out how to put it all together.

Offline


For the main presentation, we selected six demographic attributes: homeless, low income, absences, dropouts, English language learners, and students of color. Each of the squares you see has 100 push pins, with the colored ones representing the percent of students in that group.

For three of the groups, we created callouts that provide more detail. For example, our English language learners might only be 2% of our students, but more than 25 languages are represented by that group.

It's been fun to see and hear about people who touch the pins. I'm glad that they feel like they can. I have grand plans in the coming months to employ various paper pop-ups and other things that will invite some exploration with more than the eyes. I had someone comment that seeing the purple pins (representing low income) made her sad. So much of the time, we look at data as numbers on a page. It didn't make the same impact for her as seeing the display.

The rest of the display is devoted to information on enrollment changes, along with some projections by the district and city about the future of our demographics. None of this is earth-shattering or super-fancy, but it feels good to put it out there. It's time to start some different conversations about data.

Online
Each month, we're building a companion web page. This month, I created some simple waffle charts (to reflect the offline displays) and a line graph that users can interact with via Excel slicers. There is a QR code on the bulletin board which links directly to the online options.
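A waffle chart is just a 10 x 10 grid with one cell per percentage point, the same idea as the 100 push pins. Here is a minimal text sketch in Python. The function is my own illustration rather than how the web page was built, and the 2% figure is the English language learner share mentioned above.

```python
def waffle(percent, size=10, filled="X", empty="."):
    """Return a size x size grid (a list of strings).

    With the default 10 x 10 grid, each cell is one percentage
    point, so `percent` cells are filled, starting at the top left.
    """
    cells = size * size
    n_filled = round(percent / 100 * cells)
    marks = filled * n_filled + empty * (cells - n_filled)
    return [marks[i:i + size] for i in range(0, cells, size)]

# English language learners: about 2% of students.
grid = waffle(2)
print("\n".join(grid))
```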

A big focus for me this year is on being more transparent about the ethics involved in the choices made about these displays---from which data are (or are not) represented, to downloadable data sets, to the reasons behind the specific charts. It is a privilege for me to have access to the data that I have. It's also a lot of power...and somehow, I need to make sure that I publicly acknowledge that and invite comment.

Bonus Round
Next month...which is really sometime this month...I'll be presenting data related to a month in our board room. I know, that doesn't sound very sexy, but I think the Outlook calendars for that room will reveal a lot about our priorities and partnerships. It's not something we've ever looked at, which is why I think it will be an ideal candidate for this project.

Are you trying something new this year?