Aug 3 • 42M

What a Better Assessment System Would Look Like

A Conversation with Jay McTighe

 

The author of 18 books on education, Jay McTighe is well known among educators, particularly for his book "Understanding by Design" with Grant Wiggins. A regular speaker at education conferences, he's also the author of a more recent white paper for rethinking the nation's assessment system. In this conversation, McTighe walked through the present problems with today's assessment system and the steps he would take to better capture and encourage student learning. As always, in addition to listening to the podcast above, you can also watch the conversation here.

Normally these transcripts are released only to paying subscribers, but today I’m sending this out to every subscriber as a bonus for the start of the summer. If you’d like to subscribe to get this content and more, click below.

Michael Horn:  In addition to being a legend in the circles of education, Jay McTighe is the author more recently of a white paper around assessments that caught my eye because I think it's a topic that is critically important. Too often in education, my sense is that we think of assessments as something that is either good or bad rather than digging into the nuance and realizing that assessments are important, both in understanding how students and schools are doing, but also because they're a critical tool in learning itself.

                        And yet, today's assessment system, in my judgment, honestly does a pretty poor job on both accounts. And I think it's holding back deeper innovations in practice that we need to allow all students to build their passions and fulfill their potential. So enter Jay's paper against this backdrop, and I think it's one of the smartest distillations I've seen of what's wrong with the current system, and offers a great portrait of something that could easily and feasibly replace it and be significantly better for every stakeholder. So with that as backdrop, Jay, thanks for doing this. It's good to see you.

Jay McTighe:     My pleasure, Mike. Thanks for hosting this and hello to everyone who may be listening.

Horn:                Let's start with your own background in this topic. For those that don't know, how did you come to this question of assessments? In your prior background, have you seen this as an important topic with which to grapple?

McTighe:          Well, I'll try to be brief, but I've been in education more than 50 years, starting my teaching in 1971. And I've worked at school, district and state levels as well as now my work as a writer and a consultant working literally around the world in international education. So my interest in assessment was not front burner for me. My interest has always been in my career, teaching for what I'll call deeper learning, engaging students in meaningful learning, framing learning around authentic tasks, engaging students in, what I hope for them is, authentic work, including emphasizing higher order thinking, et cetera. So my roots were initially in instruction and the kind of teaching that led to those kinds of learnings.

                        Then mid-career, I met and wrote the book Understanding By Design with Grant Wiggins, which as some people know, is really a curriculum and assessment framework, but it's oriented toward the same goals that I described a moment ago. Focusing on deeper learning, teaching for understanding, engaging students in authentic tasks, where they apply their learning in realistic ways.

                        Assessment then, my route to assessment was really through the back door. Like you in your opening statement, I observed over many years, both as a teacher and administrator and even in state ed, the driving impact of assessments and particularly external standardized assessments used for accountability purposes. And it struck me that in many cases the assessment tail seemed to be wagging the dog, the instruction and curricular dog. And so I got more interested in assessment because I thought if the assessments are so impactful in how they influence teaching, learning, and curriculum and classroom assessments for that matter, let's put our attention to the tail. Let's try to better understand assessment and how to make the tail wag the dog in a better direction, if you will.

Horn:                I love that analogy. A little scary, but I think it's an accurate one of what's happening right now. I'm curious, let's dig into the problem itself. The first third or so of your paper does a really wonderful job of explicating the problem you see with today's dominant assessment system in the United States. I'd love you to detail that and say exactly the challenges that it causes on the ground.

McTighe:          Well, I'll give you my analysis, but I suspect you could talk to anyone who's been teaching more than a few years and certainly any school administrator and I think they could echo and probably expand on this, but to summarize, in the United States, primarily standardized, external standardized tests are used as accountability mechanisms. And I think there's a legitimate purpose and an important question that educators should not shy away from, which is there's a huge expenditure of public monies going into education and the public and policy makers deserve an answer to a basic question. How well are schools doing? How well is this considerable expenditure paying off in terms of learning? And so that's a legitimate question.

                        I think the flaw is that the attempts to answer that question have, as we know, typically come out in the form of a once a year external standardized test where the results are collected, compiled, communicated, published, and those are used as accountability systems for ranking schools and determining essentially school quality based on a single snapshot test score. So in terms of some of the casualties of such a system, I think these are well known. For understandable reasons, most external standardized tests use a selected response format, i.e., multiple choice, or some states have a short answer component, often known as brief constructed responses. Now, it's understandable. These are testing hundreds of thousands of students. They need to be able to get results quickly. And so you can machine score selected response format test items and get scores quickly. So that's understood. That makes sense.

                        The problem, one problem of course, is that format of selected response is inherently limiting. It does not or cannot appropriately assess all valued educational outcomes. It's good for assessing certain things, do kids know basic information. You can test for basic concept understanding through a multiple choice format. And in some cases you can assess some degree of skill proficiency, but it's often an indirect method at skills. Having said that, there are many, many things that we value that aren't appropriately assessed in that format. And here's the simplest of examples.

                        English language arts standards in every state and province and anywhere in the world in fact that I've seen, call for developing student proficiency in reading, writing, listening, speaking, and often research. I know of very few standardized measures, accountability tests, more specifically, that test listening and speaking and research. And right now in the US, very few states have true writing assessments. What they're using are proxy multiple choice items to make inferences about reading and writing. Well, understandably, the large scale nature of such tests makes it not surprising that they're using multiple choice kind of methods, but the casualty is that we're not assessing many things that are at the core of literacy. I mean, listening and speaking are the underpinnings of reading, writing, but we're not assessing those.

                        But that links then to the "high stakes" of these assessments. Schools are judged by these assessments. Districts are judged by these assessments and there are consequences for poor performance. In some states, a school can be taken over by a state entity or a commercial company if they're not performing over time. Administrators can lose their jobs if the scores don't go up, particularly in the low performing schools. And teachers are under enormous pressure for these as well. Consequently, the high stakes pressure of these tests often drives instruction and drives curriculum and drives classroom assessments toward them. To me, it's like a black hole. It sucks everything into the prescribed format. And so what we potentially see is a lot of what I'll call multiple choice teaching, narrowing of the curriculum to focus primarily on just the tested subjects and the tested skills to the exclusion of some of the other things that may not be tested.

                        I have literally been in schools where I've seen announcements by administrators saying, if it's not in the state test, it should be very low on your teaching priority, even though the standards say we should be doing these things. Even when we go outside of academic disciplines, and not all of those are tested, we have other goals that are important to a modern education, often embodied in what some districts are developing or schools are developing known as a portrait of a graduate. And the portrait of a graduate process, I think, is important and timely because it identifies competencies that we know are important in the world outside of school. All the employer lists of the skills of employment today call for things like the ability to communicate effectively using multimedia, the ability to work well in teams, creative thinking, critical thinking, global or just normal citizenship is often recognized as an important competency. And yet these important competencies are generally not assessed through high stakes, standardized tests in their present incarnation.

                        It's the double whammy, if you will, of the high stakes, the restricted format that conspire to narrow the curriculum and focus teaching around the tested areas and also focus teaching often on developing isolated knowledge and skills, decontextualized, as opposed to, can students understand and put it all together in a more authentic way. Because the tests don't assess in authentic ways.


Horn:                Jay, I just want to double click on something and you do a good job of talking about it in the paper as well. You point out that even given this reality, narrowing practice and things of that nature is actually not even the best way to move the needle on the current tests that we have. You use an analogy, the one that I like to use is sort of similar to doctors in the 1800s where we said, "Oh, we got to lower blood pressure, therefore we'll put leeches on you." And yes, it was effective at lowering blood pressure, perhaps not improving overall health. A lot of the narrowing of the curriculum actually hurts reading scores on tests and things of that nature.

                        Just to play devil's advocate, it's a question I like to ask and I'm curious your take on it, if there are actually better ways to move the needle on these narrow assessments that don't test the range of things that we want to see, but actually a broader set of practices would also help move the needle on these things. Why is it the fault of the tests rather than the reaction of the educators to the test? I'd love you to just go one beat deeper, because I think the pushback I hear to what you just said is sometimes, well, sure, that's true, but we don't want to have the most onerous assessment system ever. You have an answer to that, but we'll get to that in a moment. I'm just curious sort of why you think it's the fault of the tests versus the way we've reacted to the tests.

McTighe:          Oh, I've never claimed that it's solely the result of the tests. I think your question gets at another key point, which is, I think there's a prevailing misconception about how to "raise the test scores" on the traditional use or current round of standardized tests, which are primarily selected response in nature. The misconception is, because the test items are typically multiple choice, they must be low level. And the best way of getting the scores up on those kinds of tests is to do a lot of practice testing. So you give kids practice in the multiple choice format and also covering a lot of content just in case an item might be tested and therefore the students will know it.

                        Well, there's a logic to that. In a sense, if this is the measure, we should prepare kids by doing a lot of practice in the measure, but here's the misconception. The misconception is that because they're multiple choice format, the items are "low level." If you and any of your viewers are familiar with the depth of knowledge scale, or DOK, a framework developed by Dr. Norman Webb, he profiled four levels of the "cognitive complexity" of an assignment or a task or a test question for that matter. And often what we see in test prep materials that are in place to "help kids practice" and get better at the test format to raise the test scores, are low level. They're level one of DOK, which involve basically recall or very simple skill applications. Whereas, and this is something that viewers and listeners can check out if your state or province releases test results and item analyses, and you can also see the same thing on NAEP, the National Assessment of Educational Progress, here's the question. What are the most widely missed items on those standardized tests?

                        They are not, in general, items of basic skills or recall ability. They're DOK three items. The items, even though they're multiple choice in nature, require some inference or interpretation in reading passages, some reasoning in multi-step application in math problems. They're not low level. And so people are often seduced by the format, i.e., multiple choice equals low level. Therefore we can drill and practice low level thinking so the kids will raise the test scores. Look, we've had 15 years or more of "test prep" stuff. And there's a whole cottage industry of companies, including some of the test companies for that matter, selling "test prep" materials and schools and districts often have their own version, often known as benchmark or interim assessments that some districts might create. Often those are lower level than the items missed on the state test.

                        Here's my therefore. If you do nothing to change the present accountability testing system, and you're stuck with the kind of test we have now, the best test prep in my view is to give kids lots of opportunities to apply their learning in more authentic ways, engaging them in higher applications, inference, interpretation, analysis, problem solving, creating, as in writing and presenting, and that's going to be the best test prep. As opposed to drill and practice on low level items, practicing the multiple choice format.

                        Just to finish up this little rant with the analogy, and my colleague Grant Wiggins came up with this brilliant one. He says, practicing for a standardized test to get the scores up, is like practicing for your annual physical exam to get better results. And by the way, I literally have a physical, my annual physical tomorrow, and I had to take my blood test last week so the physician would have the results. Well, if I wanted to practice for my physical, I might have changed my diet the last couple days to cut out sugar or carbs or whatever to kind of skew the results. But if my doctor knew I was doing that, she would say, "No, Jay. Uh-uh (negative). Just keep up your normal routine because we're just sampling and we're going to get a sample of your health."

                        Now, the analogy breaks down in one kind of interesting way. In medicine, if you take a blood test or any kind of test and the results are abnormal or out of the ordinary, or maybe disturbing, what do they do? They say, "We have to do some further testing." Because they recognize that the blood test or the annual is just a sample of a few things. What do we do in education? We publish the results in the paper.

Horn:                We just move on.

McTighe:          As if that one snapshot is the be all and end all. But anyway, I like the analogy that Grant came up with.

Horn:                It actually lends itself well, though, then into the solution to the assessment problem, that could perhaps be part of the reason that educators start to feel that they can embrace deeper learning and more robust, real life projects as part of the learning for students, and you go into a three part solution. I'd love you just to outline at a high level what you conceptualize and then we can dig into each of the three components.

McTighe:          Great. So assessment fundamentally should be based on our goals. And so we need to start, before thinking about an assessment or assessment system, we need to think about our goals. And categorically speaking, I propose that there are three broadly cast types of learning goals. And these aren't new, but they're important to keep in mind. There's what Grant Wiggins and I have called acquisition goals. Namely, what knowledge and skills do we want students to acquire as they're learning new things. Now, by knowledge, I'm talking quite literally about factual knowledge or basic concepts. And skills are just that, simple skills, to more complex skills going into processes. Those are what I call acquisition goals, things we want students to acquire.

                        We also have what I've called understanding goals. And an understanding is more than just a fact. An understanding is about a more conceptual idea, a concept, a principle, LE. And also a process: understanding the nature of scientific methods or what makes good persuasive writing. There's understanding about those processes. And thirdly, we have transfer. Defined as the ability for a learner to take what they've learned in one context and apply it effectively and appropriately in something that's new. So it's not just rote, it's not just recall. They have to think and apply their learning effectively.

                        Arguably, we have those three goal types. And by the way, you can analyze published standards in subject areas. And you can see that there are all three goal types in the standards. And the so-called portrait of a graduate competencies or 21st century skills really call for transfer. We want to develop critical thinkers, not on a given issue, but on issues they encounter in their lives, and so forth. So if you take those three goal types and ask the question then, what assessment evidence should we collect that will help us know how kids are learning those things? That to me implies a multifaceted system.

                        Here's my analogy, then I'll summarize the three types that I'm proposing. Think of any assessment as a photograph or a snapshot. Like a snapshot, it's revealing. It gives us a picture of something, but a single picture is inadequate to showing the full range of a person or a situation. For that, we need a photo album. A collection of pictures taken over time is much more revealing, much more informative than any single picture within. So let me start by saying, let's think about an assessment system that has multiple pictures as opposed to a single once a year snapshot, because that's inherently limited. And psychometrically, we are less able to make sound inferences from a single piece of evidence than we are from multiple sources.

                        So having said that, I'm proposing a three-part system, think of it as a three-legged stool. One part would be similar to what we have now, content oriented tests to test important knowledge and skills that students should acquire, typically within subject areas. The second one is a set of performance tasks that are more authentic and that engage students in applying their learning in realistic situations, thus giving evidence of understanding and transfer. And the third leg of this three-legged stool would be curriculum embedded local assessments. And this would bring in a variety of things, including traditional course exams that many high schools have now, but also things like genius hour or personal projects, passion projects that kids are interested in. It could involve exhibitions. It could involve a variety of more authentic learnings that are otherwise outside of the mold of traditional tests. So that's in a nutshell, the three-legged stool based on the fact that we have different, but important goals, that should be assessed. Photo album, not snapshot.

Horn:                That's perfect to start to lay out what this could look like. Let's drill into each of those one by one, and then we can wrap up. Because I realize we both get excited about this topic and we go longer than we intend. The first one, in terms of the content specific assessments themselves, importantly, you have this sort of like NAEP, if I understand your proposal correctly, that it would be a sampling mechanism. So not everyone in the country will be sitting down to take the two hour test each day for two weeks or whatever it might be. But instead some subset, say 100 students in 50 schools or something like that, or 50 students in 100 schools, whatever, would take some portion of a set of content tests in any geography basically to get a sense for, are we accomplishing these knowledge and skill acquisition goals themselves? Am I getting that correct and what would you amplify on that?

McTighe:          Well, there are two parts to the first leg of this stool. Basically standardized tests, pretty much like we know now, selected response, maybe some brief constructed response formats, testing content, knowledge and skills, kind of acquisition level. My simple proposal is why do we need 50 different sets of state assessments? It's a huge expense. Why not use a system that we know and is psychometrically sound, NAEP. We could use NAEP tests. We have them already and use them nationally. So that's part one. But even if every state wanted to do their own thing, you could still do that because you have that now.

                        The sampling part of that equation, however, goes back to the purpose of standardized testing. My contention is the primary purpose of external standardized testing is for accountability to answer the question, how well are schools doing? To answer that question you need comparability, which to me, why would you want to have 50 different sets of state tests that aren't comparable? Use NAEP, now you have a comparable measure. But secondly, your goal is not individual scores. Your goal is not to say how well is Jay's grandchild doing in third grade? Your goal is to get a broad brush look at how the schools are doing overall, so you don't have to test every kid on every item. You can sample and thereby save a huge amount of cost and testing time while still getting the data you need to answer the school accountability and district accountability question.

Horn:                Super interesting. And just to be clear, you could use the NAEP infrastructure, if I'm understanding you correctly, which is already given on a biennial basis, to accomplish at least large parts of that. So you would save actually a lot of cost and time from what exists currently in these snapshot, once a year assessments.

McTighe:          The only difference I would suggest with NAEP is if you were going to implement the ideas that I'm putting forth, you would do more than sample 1000 eighth graders. You would probably test every kid in a school and district that is currently tested. The difference is you're not going to test them on every single item. You would have a sample of kids doing the math and a representative sample doing the ELA and so on.
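The split McTighe describes here, in which every student is tested but each one sees only part of the item pool, is sometimes called matrix sampling. A minimal sketch in Python, assuming a hypothetical roster of 300 students and three subject blocks (the function name, student IDs, and subject list are illustrative and do not come from the white paper):

```python
import random
from collections import Counter


def assign_subjects(student_ids, subjects, seed=0):
    """Randomly give each student exactly one subject block, in near-equal groups."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    # Deal students out round-robin so group sizes differ by at most one.
    return {sid: subjects[i % len(subjects)] for i, sid in enumerate(shuffled)}


# Hypothetical roster: every student is tested, but on one subject only.
students = [f"student_{n}" for n in range(300)]
assignment = assign_subjects(students, ["math", "ELA", "science"])
counts = Counter(assignment.values())
print(counts)
```

Because the groups are drawn at random, each subject's results can stand in for the whole school, while any one student sits for only a third of the testing time in this example.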

Horn:                Gotcha.

McTighe:          But you could use the NAEP items as the content of the test.

Horn:                Gotcha. Okay. That's super helpful. So then on the second part of this, the projects, the authentic projects that students would be doing, in the paper, you talk about how you would establish rubrics and validity and things of that nature and training teachers to be able to do this scoring, much as we do, say, in AP exams or things of that nature. Talk a little bit more about what that would look like. How it's feasible and practical from a cost and training perspective and what you would envision that ultimately looking like. Would these be teachers in one school assessing the projects of students in another school or what would this actually look like?

McTighe:          The second leg of my three-legged stool is the big one. I'm going to describe them as performance tasks rather than projects.

Horn:                Yes. Thank you.

McTighe:          Projects tend to be longer and more student directed. These are going to be developed tasks, well developed tasks. Some will be within subjects, but often the task will spill over from one subject to another. So you might have a language arts and social studies task, or you might have a science task that has a writing component for instance. These are well developed performance tasks, and they are intended to be curriculum embedded. And by that I mean, rather than saying, "Oh, we have to stop teaching and now do these tests." The assessment should be an outgrowth of our teaching. They should be derived directly from standards, but because they're performance tasks, they would draw on the standards that involve performance. They would call for transfer. Can students apply their learning to this new situation? That's what the task sets up.

                        Now, even though this is the more demanding part of my three-legged stool, it's not unprecedented. Your example of AP scoring is a good one. My wife is in the arts and she participated in AP portfolio reviews in visual arts. We have music adjudications. But also, and this is one of my personal experiences, in the '90s there were several states that had statewide performance assessments, Kentucky, Connecticut, Vermont, and my state of Maryland. And I worked for nine years at the Maryland Department of Ed during the era where we developed our first generation of standards and companion performance based state assessments. For nine years we had no multiple choice items on Maryland state tests. And I watched that through the lens of my children going through Maryland schools. And I saw the impact that the state performance assessments had on their learning and what they did at home and so on. It was for the good.

                        But the real challenge in large scale performance assessment is the cost and the time required to score them, because you're not going to run a student essay through a Scantron machine or view their oral presentation. And that was the downfall of most state performance assessments even though the intention was good. It was not manageable or affordable.

                        My proposal is straightforward and I believe it's doable, although I'm not confident that people will embrace it and make it happen. And that is, what if we had a series of well developed performance tasks and well developed rubrics that were given periodically during the course of the year, again, sample, not every kid takes everything, every one. And for the scoring we had, let's hypothetically say, three times a year, schools in a region would close and teachers in those schools and administrators would come together at scoring sites to actually review the work that kids produced on the performance tasks and score them against well developed rubrics, anchor papers, and inter rater reliability protocols. We know how to do this large scale.

                        The benefit to me, I witnessed this, so I will go toe to toe with anyone who challenges this point. When teachers learn how to look at student work in teams against a good rubric with an inter rater reliability protocol and anchor samples of the levels tied to the rubric of student work, it is one of the most powerful, professional learning experiences that professionals can get. In so doing, you really come to understand the standards being assessed. You look collectively at student work through the lens of a good rubric. And so you're really sharpening what makes quality work, what are the most salient traits? And even more importantly perhaps, when we did scoring sites in Maryland, it wasn't just about getting a score or a number, teachers were invariably talking about what this means for my teaching.

                        If you spend a day with colleagues looking at student work on math problems or social studies, language arts tasks, whatever it might be, you begin to see not only the strengths in student work, but the areas of weakness. And much of the conversation in those scoring sessions was, how do you deal with this problem or do you know any resources for that? There was a lot of discussion about how can we improve the performance that we're seeing. So in a sense, what I'm calling for is an organized, structured, systemic approach to professional learning communities writ large.

                        Here's a quick analogy, football coaches that I've known over my many years in education often will look at game film from Friday night or Saturday's game as a team of coaches. They'll analyze the team's performance in the game and they'll focus next week's practice on addressing the weaknesses. That's what I'm calling for, a system that can do this. We can get reliable scores using teachers with the proper training and well developed rubrics and anchor samples.

                        And as to the cost, if you only label that scoring an assessment cost and you pay external companies to do it, it's unaffordable. If you conceive of it as a professional development/assessment system cost, and you do it as part of teachers' jobs, three times a year, it's well worth it. There's a lot of nuance in all this, of course, and the details, but that's the construct. And I have seen it work. I know it can work. I'm not wildly optimistic that the organizational system will be enacted to make it work.
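One common way to check the inter rater reliability discussed above is Cohen's kappa, which compares how often two scorers agree with how often they would agree by chance. A minimal sketch in Python, assuming two hypothetical teachers score the same eight pieces of student work on a 1-to-4 rubric (the scores are invented for illustration, and the conversation does not say which statistic the Maryland scoring sites used):

```python
from collections import Counter


def cohens_kappa(scores_a, scores_b):
    """Chance-corrected agreement between two raters scoring the same items."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(scores_a)
    # Share of items on which the two raters gave the identical rubric level.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Agreement expected if each rater assigned levels independently at their own rates.
    freq_a, freq_b = Counter(scores_a), Counter(scores_b)
    expected = sum(freq_a[level] * freq_b[level] for level in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical level throughout
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1 = lowest, 4 = highest) for eight student papers.
teacher_a = [4, 3, 3, 2, 4, 1, 2, 3]
teacher_b = [4, 3, 2, 2, 4, 1, 3, 3]
kappa = cohens_kappa(teacher_a, teacher_b)
print(round(kappa, 3))
```

A value near 1 means the two scorers are applying the rubric the same way, and a value near 0 means their agreement is no better than chance, which would signal that more calibration on anchor papers is needed.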

Horn:                It's interesting, just because you've dealt with the cost question about unlocking the professional development funds, it creates a much more purposeful, I would argue, professional development rather than what currently happens, where a district pays some schmo like me to come in and do a day with the teachers and so forth. I think on multiple levels, it's very interesting. And the inter rater reliability piece of it built into the architecture, creates something that I think would address a lot of people's concerns historically about moving to more performance based tasks in the course of these assessments. I found it very interesting.

                        Let's with the remaining time move to the third leg of the stool, the local assessments. Which in many cases already exist. They already go on in schools. I want to call out, I think a couple nuances that I think I see in what you've proposed, and then get your take on those specifically. One is, you have this notion of audits being part of it. So to make sure that these local assessments that are being used are in fact valid and reliable, we can believe what they are telling us.

                        And then secondly, and this is maybe me inferring something that's not there, so I'm curious. But it seems to me, we could use the content assessments from the first leg of the stool and the sampling mechanism just to ask ourselves, are the local assessment results that schools are reporting out broadly in tune with what we would expect from these content assessments? Meaning they're moving together in some ways. And it won't be a perfect one to one because the local assessments will be far broader. There'll be things that we don't even think to test on the content specific assessments, but they may identify certain areas where, whoa. Those are wildly divergent results that might cause an auditor to go a little bit deeper. And so they sort of reinforce each other or check each other in some ways. What's your take on all that and what am I missing?

McTighe:          I'll come to that in a moment, but let me comment on one of the underlying purposes for this third leg of the stool, local assessments. First of all, one of the things I think an assessment system needs to be mindful of is whether locals, and I'm thinking about students, teachers, parents, administrators, have any skin in the game. And let's be honest about another reality that some teachers and administrators recognize. In some states, in many states, in fact, where the standardized accountability tests don't count for students, sometimes kids will blow it off. I mean, I've been in schools where I've seen kids during test day making pictures on the bubble sheets, 'cause they don't give a rat's you-know-what about the test because it doesn't count for them.

                        And often teachers will disown the tests. They'll see them as an unwelcome intrusion so they can get back to teaching when they're done. We want a system where people say the assessments matter and that we have skin in the game and we're going to take them seriously. Students are more likely to take something seriously if it's going to count for them, so these local assessments should count. They should be part of assessment and grading and so on. That's point one.

                        Point two is, with the photo album analogy, the first leg of the stool, multiple choice tests, more content-specific, might be like a wide-angle lens. We're going to sample a lot of content through these tests. The performance task might be more like a close-up or a macro view where we're going to go deep on a small number of tasks to see if kids really can apply their learning. To me, the third leg of this stool might be somewhat in the middle. Or to say it a different way, we want to complement things that are falling through the cracks from the first two legs. And so that may be in the form of traditional course exams or end-of-year tests that some schools have now, especially at the high school level. We're going to test some things that aren't on the other two legs. But it also could allow us to assess things that we know are important, but are otherwise eluding the first two, including giving kids some choice and voice in how they're showing what they're learning and what they can do. So that's the purpose.

                        Regarding your sampling question of how these might correlate, I guess the correlation question.

Horn:                Right.

McTighe:          I would be cautious about that. For me, that wouldn't be the primary purpose.

Horn:                Okay.

McTighe:          Because ultimately, and arguably, you're going to be assessing different things on all three legs of the stool. In general, you might say, "Yeah, we should be able to see some broad patterns." But I wouldn't want to say the standardized tests are predictive of what the kids do in local assessments or vice versa, so I'd be cautious on that note. At least that's not my primary goal here.

Horn:                So the auditors, you would lean on that for that part of the process, just to make sure that there's some validity and reliability.

McTighe:          The auditing part is just to kind of keep people honest, but also recognize that schools can become somewhat siloed and their frame of reference is their internal operation. And so we might, for instance, just hypothetically, see a science assessment or a math test in Algebra 1 or in third grade science being remarkably diverse and hugely different in terms of what's being assessed and the cognitive demand from School A to School B to School C. So an audit would be simply, periodically, let's just check who's doing what. What's the nature of these local assessments?

                        The reality, though, is they're not going to be comparable. The first two legs of the stool are comparable and there's a need for comparability and accountability testing. But the third leg, again, has kind of a different purpose, but in large measure, it's meant to round out the assessment picture and to make sure that we're assessing all the things that we claim to value, not just the things that are easiest to test and quantify.

Horn:                Super helpful. Jay, really appreciate you being on here. I think the proposal, like I said, really addresses a lot of the challenges with today's assessment system, but not only that, would help bring our schools to something more robust in line with the goals that you laid out at the beginning of the talk. So I've provided a link here again to the paper, "Measuring What Matters." You can see it on the screen. If you're listening to this, I'll post it in the notes and in the transcript so people can check it out. Read it.

                        And I hope, despite your contention, Jay, that they take seriously all three steps and focus on the second one around the performance tasks to see those implemented. I think it'd be a big step toward helping us create something more robust, both in assessment and practice and showing what we truly value for students. Really appreciate you being here, and for all of you tuning in, thanks again for joining us on the Future of Education. We'll see you next time.