Message: To Print

From: Sireno, Lisa
Date: Monday, September 18, 2017 1:21 PM
To: Hagenhoff, Margaret
Subject: To Print

Margie,

 

Please make 14 copies of each of the attached documents.  I need them ASAP.

 

Thanks,

Lisa

 

Lisa Sireno | Standards and Assessment Administrator | Office of College and Career Readiness

Missouri Department of Elementary and Secondary Education | 573-751-3545 | dese.mo.gov

 


Missouri Assessment Program

Technical Advisory Committee (TAC) Member Biographies

Andy Porter, TAC Chair, is the George & Diane Weiss Professor of Education, emeritus, at the University of Pennsylvania, where he was dean of the School of Education. Previously he was a professor at Michigan State University, held the Anderson-Bascom Professorship in Education at the University of Wisconsin-Madison, and was the Patricia and Rodes Hart Professor of Educational Leadership and Policy at Vanderbilt University. Currently, his work is supported by the US Department of Education's Institute of Education Sciences, through which he directs the Center for Standards, Alignment, Instruction and Learning (C-SAIL).

Trained as a statistician/psychometrician, Dr. Porter has conducted research on teacher decision-making that created the Surveys of Enacted Curriculum (SEC) tools for measuring the content of instruction and content alignment (e.g., the alignment of student achievement assessments with state content standards), as well as teacher log procedures. He is senior author of the VAL-ED assessment of school leadership. He is an elected member of the National Academy of Education, a recipient of the AERA Distinguished Contributions to Education Research Award, a Lifetime National Associate of the National Academies, a past president of the American Educational Research Association, and was for eight years a member of the National Assessment Governing Board.

Bertha Doar is Director of Assessment, St. Louis Public Schools. Dr. Doar’s background includes research and evaluation roles in Missouri schools and adjunct faculty positions. Prior to joining St. Louis Public Schools, she served as Director of Data Analysis and Quality Management at Rockwood School District for nine years. Her professional focus is building educational systems to facilitate positive growth. She is a leader of the St. Louis Assessment Resource Cooperative group, which helps educators learn more about assessment, data analysis and educational program improvement. Dr. Doar received her Ph.D. from Washington University in St. Louis.

Dr. Karla Egan, EdMetric LLC, is a nationally recognized expert in psychometrics. She advises state and national practitioners as well as policymakers on assessment systems and practices. Throughout her career, she has designed and implemented the psychometric aspects of statewide summative assessments, published papers on standard setting and test security, and presented papers on issues related to assessment design, test security, and standard setting. Dr. Egan created an innovative framework for achievement level descriptors that was used by Smarter Balanced, and she served on the National Academy of Sciences committee that evaluated NAEP reading and mathematics achievement levels. Dr. Egan serves on technical advisory committees for Dynamic Learning Maps, Louisiana, Missouri, and North Dakota, and she chairs the Indiana technical advisory committee. Dr. Egan received her Ph.D. from the University of Massachusetts, Amherst.

Dr. Ronald Mertz served 23 years in the St. Louis Public Schools' Division of Evaluation, Research, and Assessment as a divisional assistant, assistant director, and director. During his time with the District, he coauthored or directed more than 50 evaluation/assessment reports, managed the District's assessment program, and began serving on Missouri's technical advisory committee (TAC). Since retirement, he has continued to serve on the Missouri TAC and has conducted a number of program evaluations as well as a needs assessment for the International Institute of St. Louis. He presently serves as the evaluation/assessment consultant for an early childhood intervention program in Liberia. Prior to his service in the St. Louis Public Schools, he conducted research among diverse populations in Taiwan, Belize, Florida, and Arizona, and held academic positions at Taiwan Provincial College of Education and at Jacksonville State University in Alabama, where he also served as co-chair of his department. His earlier teaching experience includes serving as director of a Seminole Tribe of Florida Head Start Center, as an elementary school teacher in Michigan, and as a Peace Corps volunteer in Liberia. Dr. Mertz received his Ph.D. from the University of Arizona.

Barbara S. Plake is Distinguished Professor Emeritus at the University of Nebraska-Lincoln, where she was on the faculty in the Department of Quantitative and Qualitative Methods in Education and Director of the Buros Center for Testing for nearly 30 years. She specializes in standard setting, validity, and computerized adaptive testing. She has authored over 200 journal articles, book chapters, conference papers, and other publications. Her work has been published in journals such as The Journal of Educational Measurement, Educational Measurement: Issues and Practice, Educational and Psychological Measurement, Educational Measurement, Applied Measurement in Education, Applied Psychological Measurement, and elsewhere. She is a contributor to The Handbook of Educational Measurement and co-editor of The Mental Measurements Yearbook and Tests in Print. She is founding co-editor of Applied Measurement in Education.

Dr. Plake is a consultant on testing with several states and organizations. Her research focuses on classroom assessment practices, computerized testing, and on methods for determining the passing score on high-stakes tests, such as those for high school graduation eligibility decisions. Dr. Plake is an active member of the American Educational Research Association (AERA) where she was inducted as a Fellow in 2008. She served as Secretary and Program Co-chair of Division D (Measurement and Research Methodology). She was President of the National Council on Measurement in Education (NCME) in 1992 and served on their Board of Directors from 1986-1993. In 2006 she received NCME’s Award for Career Contributions to Educational Measurement. 

She is a Fellow of Division 5 (Measurement and Research Methods) of the American Psychological Association (APA) and co-chairs AERA, APA, and NCME’s Joint Committee on the Revision to the Standards for Educational and Psychological Testing (with Lauress Wise). In 2005 she received the Career Achievement Award from the Association of Test Publishers. Dr. Plake received her Ph.D. in Educational Measurement and Statistics from the University of Iowa in 1976. She was a Research Associate at American College Testing Programs (ACT) in Iowa City, Iowa before joining the faculty at the University of Nebraska-Lincoln in 1977. Previously she was an analytical engineer at Pratt & Whitney Aircraft and a middle school mathematics teacher.

Edward Roeber currently serves as a consultant in educational assessment to various organizations.

Prior to these activities, Dr. Roeber served as an advisor on student assessment to the WIDA Consortium at the University of Wisconsin. He was also an Adjunct Professor of Measurement and Quantitative Methods in the Michigan State University College of Education, East Lansing, MI, from 2007 to 2012. In this capacity, he taught courses on educational measurement, worked to improve the assessment skills of prospective and current educators, conducted research on how teachers learn to use formative assessment strategies, and provided additional support for faculty and students on assessment.
 
Previously, he was Senior Executive Director, Office of Educational Assessment & Accountability in the Michigan Department of Education from 2003 to 2007. He oversaw the assessments of general education students (in mathematics, science, language arts and social studies), students with disabilities and English language learners, as well as the accreditation and accountability programs.  
 
From 1998 to 2003, he was Vice President, External Relations for Measured Progress, a non-profit educational assessment organization located in Dover, New Hampshire. He worked with state policy leaders and staff of state and local education agencies to help design, develop, and implement quality assessment programs. He directed the company’s efforts to develop alternate assessments for students with significant disabilities for thirteen state clients. He also helped to develop high school fine arts assessments in dance, music, theatre, and visual arts for the state of New York.  
 
From 1991 to 1998, he was Director, Student Assessment Programs for the Council of Chief State School Officers. He developed and implemented the State Collaborative on Assessment and Student Standards (SCASS) project including various collaborative assessment activities among states in assessing special education students, assessment systems for Title I, and arts assessment, as well as full-scale, multi-state collaborative development projects in health education, science education, and social studies. He also directed or co-directed the CCSSO National Conference on Student Assessment from 1992 to 1998. 
 
Prior to joining CCSSO in 1991, he was supervisor of the Michigan Educational Assessment Program, Michigan Department of Education, Lansing, Michigan (from 1976 to 1991) and a consultant in the Department from 1972-76. He began his career as a consultant with the Education Commission of the States, working on the National Assessment of Educational Progress in the areas of mathematics, music, reading, science, visual arts, and writing from 1969 to 1972.  
 
Edward Roeber received his Ph.D. in measurement and evaluation from The University of Michigan in 1970. He has consulted with a number of states as well as national organizations on the design, development, and implementation of large-scale assessment programs. He has authored numerous articles, reports, and other publications, particularly on the development of innovative assessment programs and the use and reporting of student achievement information. In addition, he has made numerous presentations to various groups around the country.

Phoebe Winter (Ph.D., Columbia University, Teachers College) conducts research in improving online assessment and contributes to the design of assessment systems that incorporate technology. Her work with state and non-governmental education agencies focuses on bringing policy, psychometric, and practical perspectives to the design and implementation of educational assessment and accountability programs.

Dr. Winter's career has included positions as a measurement and quantitative specialist in the South Carolina and Virginia Departments of Education, where she led the development and analysis of student and teacher assessment programs; project director for CCSSO's State Collaboratives on Assessment and Student Standards, including serving as a founding member and subsequently the coordinator of the Technical Issues in Large-Scale Assessment SCASS; research director at the Center for the Study of Assessment Validity and Evaluation at the University of Maryland; and executive vice president for education policy at Pacific Metrics, where she designed and directed research projects, consulted with state departments of education, and oversaw the company's research department. Currently, she provides consulting to state departments of education and the U.S. Department of Education, university-based research projects, and non-profit educational organizations.

Dr. Winter has conducted research, written and edited books and articles, and provided technical support directed at strengthening the validity of uses and interpretations of results from educational measurements. Her recent research addresses the comparability of inferences from tests administered under different conditions and the nature and degree of information provided by traditional and technology-enabled item types in mathematics and science.

Dr. Winter is currently the Secretary of AERA Division D (Measurement and Research Methodology) and has served as Chair of the Division D Significant Contribution to Educational Measurement and Research Methodology and Mentoring Committees and Chair of NCME’s Outreach Committee. Her research and professional service reflect her goal of bringing thoughtful, cross-disciplinary research employing both quantitative and qualitative methods to bear on the improvement of educational measurement so that it contributes meaningfully to teaching and learning. Winter’s Ph.D. is in psychology with a concentration on measurement, evaluation, and applied statistics.




8/26/2017

To: Shaun Bates and Lisa Sireno

From: Andy Porter, Chair, Missouri Technical Advisory Committee

Subject: Minutes of Missouri TAC Meeting on August 17 and 18, 2017

The Missouri Technical Advisory Committee met at the Renaissance St. Louis Airport Hotel from 10:00 AM to 5:00 PM on August 17, 2017, and from 7:30 AM to 12:30 PM on August 18, 2017. Members of the TAC in attendance were Bertha Doar, director of assessment, St. Louis Public Schools; Karla Egan, independent consultant; Ron Mertz, St. Louis Public Schools, retired; Barbara Plake, University of Nebraska, retired; Andy Porter, chair of the TAC, University of Pennsylvania; Ed Roeber, independent consultant; and Phoebe Winter, independent consultant. In attendance from DESE: Lisa Sireno, Shaun Bates, John Kitchens, Commissioner Margie Vandeven, Deputy Commissioner Stacey Preis, and Assistant Commissioners Blaine Henningsen and Chris Neale. In attendance from DRC: Lindy Wienand, Joanna Tomkowicz, Sara Brazzle, and Rick Mercado. In attendance from Questar: Adam Johnson, Mike Woods, Sandra Durden, Katie McClarty, and Scott Bishop. In attendance from MetaMetrics: Ellie Sanford-Moore.

Contractors did not join the meeting until lunch on the first day.

Missouri update

Lisa Sireno provided the TAC with an update on assessment activity in Missouri. The Missouri Show Me Standards provide process and content standards for the state. The Missouri Learning Standards address Grade-Level and course content. New fine arts standards will soon be drafted, but no assessments of them are planned at the present time. The personal finance standards will soon be revised and a new assessment is planned for 2019/20. Missouri adopted new learning standards for mathematics, English language arts (ELA), social studies and science in 2016. The Missouri peer review submission to the U.S. Department of Education was largely approved for Algebra I, English II and grade-level Science as well as MAP-A in mathematics and ELA.

MAP-A continues to be provided by Dynamic Learning Maps (DLM) for grades 3 through 8 and 11 in ELA and mathematics and grades 5, 8 and 11 in science. The state continues regular assessments in ELA and mathematics for grades 3 through 8 and in science for grades 5 and 8. Data Recognition Corporation (DRC) is developing interim assessments for Missouri. End-of-Course (EOC) testing continues. Sireno noted that this past year, approximately 20% of the eighth grade students took the Algebra I EOC.

Missouri continues to use the WIDA consortium for assessing English language competence. For the past three years the state has done census assessment using the ACT for 11th graders. Recently the state cut $4 million from the assessment budget (out of the approximately $18 million budget provided by the state, supplemented by $7 million from the federal government). The current plan is to address this shortfall by discontinuing the state-funded ACT census assessment at 11th grade.

New Grade-Level assessments in ELA and mathematics and new EOC assessments will be administered in 2017/18. New fifth- and eighth-grade Science assessments and a new Science EOC assessment will follow in 2018/19, and new social studies assessments, offered only in high school, will follow in 2019/20.

DESE has published several resource documents for state educators concerning the new standards.

Sireno cited as challenges to assessment in Missouri the frequency and size of changes to state standards and the impact on assessments. The number and frequency of changes has created a great deal of confusion among K-12 educators in the state. There have also been challenges to the EOC assessments, a topic of discussion for later in the meeting. At the end of the conversation about challenges, the TAC noted that many of the materials for the current meeting were received too late to be read in advance and that this practice needs to be corrected.

Grade-Level assessments technical report early tables for 2016/17

Materials were received in advance of the meeting and Joanna Tomkowicz from DRC walked the TAC through the results for ELA, mathematics and science. In addition to reporting percent of students at various proficiency levels and mean scale scores, there were also results on interrater reliability and differential item functioning (DIF). When investigating subgroup differences, there continue to be large differences between average performance for black students and white students and general education students versus students with disabilities. Tomkowicz noted that no items were dropped from use based on the DIF analyses. DESE reminded the TAC that all items were reviewed in advance of their initial use by appropriate state content committees. The TAC wondered whether it would be useful to consider bringing in some “advocacy groups” to look at DIF flagged items. The reliabilities of the assessments for subgroups were investigated and reported as good to excellent despite the possibility of restriction in range for some of the subgroups in terms of their performance on the assessments.
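
For readers unfamiliar with the DIF screening referenced above, the sketch below illustrates one common approach, Mantel-Haenszel DIF for a dichotomous item with examinees matched on total score. It is a minimal illustration with simulated data; it is not DRC's actual procedure, and the flagging rule noted in the comment is the generic ETS convention rather than anything specified in these minutes.

    """Minimal Mantel-Haenszel DIF sketch (illustrative only; not DRC's procedure)."""
    import numpy as np

    def mh_ddif(item_correct, group, total_score):
        """Return the MH D-DIF statistic for one dichotomous item.

        item_correct : 0/1 array, 1 = answered the studied item correctly
        group        : 0 = reference group, 1 = focal group
        total_score  : total raw score used as the matching variable
        """
        num = 0.0  # sum over strata of A_k * D_k / N_k
        den = 0.0  # sum over strata of B_k * C_k / N_k
        for s in np.unique(total_score):
            in_stratum = total_score == s
            ref = in_stratum & (group == 0)
            foc = in_stratum & (group == 1)
            a = np.sum(item_correct[ref] == 1)   # reference, correct
            b = np.sum(item_correct[ref] == 0)   # reference, incorrect
            c = np.sum(item_correct[foc] == 1)   # focal, correct
            d = np.sum(item_correct[foc] == 0)   # focal, incorrect
            n = a + b + c + d
            if n == 0:
                continue
            num += a * d / n
            den += b * c / n
        alpha_mh = num / den                     # common odds ratio across strata
        return -2.35 * np.log(alpha_mh)          # ETS delta metric

    # Hypothetical data: 2,000 simulated examinees with no true DIF.
    rng = np.random.default_rng(0)
    group = rng.integers(0, 2, 2000)
    total_score = rng.integers(0, 46, 2000)
    item_correct = rng.integers(0, 2, 2000)
    d = mh_ddif(item_correct, group, total_score)
    # Under the common ETS rules, items with |D-DIF| >= 1.5 (and statistically
    # significant) draw the closest review.
    print(f"MH D-DIF = {d:.2f}")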

The TAC thanked Tomkowicz and DRC for the quality of their work and was pleased to see that the Grade-Level assessments continue to function well in Missouri.

Grade-Level assessment data forensics

Reports were circulated to members of the TAC in advance of the meeting on analyses of answer changes, test time and item time. In addition, there were analyses reported of the possibility of students copying answers from one another. Overall there was very little evidence of results flagged as suspicious and possibly needing investigation except for test time. Tomkowicz noted that for the forensics analyses of time, the amount of time for an individual student is not necessarily the real time because students didn’t always remember to log out.

The TAC did note that the reports refer to flagging in terms of numbers of standard deviations when in fact the flags were identified in terms of number of standard errors. This should be corrected. In the future, DRC will investigate the possibility of unusually large gains in student achievement from one year to the next. These results will also be analyzed by crossing them with wrong to right answer change analyses.
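
The flagging rule DRC uses is not spelled out in these minutes. As a point of reference only, the sketch below shows a generic classroom-level screen for wrong-to-right answer changes expressed in standard-error units; the data, the 4-standard-error threshold, and the function names are hypothetical.

    """Illustrative classroom-level flag for wrong-to-right (WTR) answer changes.
    The rule and threshold are hypothetical, not DRC's actual forensics method."""
    import numpy as np

    def flag_classrooms(wtr_counts, classroom_ids, n_se=4.0):
        """Flag classrooms whose mean WTR count exceeds the state mean by n_se standard errors."""
        wtr_counts = np.asarray(wtr_counts, dtype=float)
        classroom_ids = np.asarray(classroom_ids)
        state_mean = wtr_counts.mean()
        state_sd = wtr_counts.std(ddof=1)
        flagged = []
        for room in np.unique(classroom_ids):
            scores = wtr_counts[classroom_ids == room]
            se = state_sd / np.sqrt(len(scores))     # standard error of a classroom mean
            z = (scores.mean() - state_mean) / se    # distance in standard-error units
            if z > n_se:
                flagged.append((room, round(z, 1)))
        return flagged

    # Hypothetical data: 50 classrooms of 25 students, one with inflated WTR counts.
    rng = np.random.default_rng(1)
    ids = np.repeat(np.arange(50), 25)
    wtr = rng.poisson(1.5, ids.size).astype(float)
    wtr[ids == 7] += 3                               # simulate one suspicious classroom
    print(flag_classrooms(wtr, ids))                 # e.g., [(7, ...)]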

The TAC complimented DESE and DRC for their attention to data forensics in detecting the possibility of cheating. Further, this commitment and the resulting analyses surely serve as a deterrent to cheating in the state. Sireno reported that DESE is developing follow-up plans in response to data forensics flags of classrooms and schools. The TAC requested an opportunity to review these follow-up plans when they are available. The TAC recommends that the forensics analyses also examine whether any school districts receive a surprising number of flagged classrooms/schools and that this district-level analysis be added to the current analyses. The TAC noted that flagged classrooms might sometimes be an indication of administration problems, and those possibilities should be investigated as well.

Embedded field tests in spring 2017

In spring 2017, 393 ELA items and 612 math items were field tested. Each item was taken by approximately 3,000 to 4,000 students. For ELA, 73 items were flagged largely for low or negative correlations with total score. For mathematics, 203 items were flagged primarily because they were too difficult. All flagged items were reviewed by DESE employees and dropped or modified as appropriate.
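
As a minimal illustration of this kind of screen (not Missouri's actual specifications), the sketch below flags items with low or negative corrected item-total correlations or very low proportions correct; the thresholds and simulated data are hypothetical.

    """Illustrative field-test item screen: flag items with low or negative corrected
    item-total correlations or very low p-values. Thresholds here are hypothetical."""
    import numpy as np

    def screen_items(responses, r_min=0.15, p_min=0.25):
        """responses: examinees x items matrix of 0/1 scores. Returns flagged item indices."""
        responses = np.asarray(responses, dtype=float)
        total = responses.sum(axis=1)
        flags = []
        for j in range(responses.shape[1]):
            p_value = responses[:, j].mean()                   # proportion correct (difficulty)
            rest = total - responses[:, j]                     # total score excluding the item
            r_it = np.corrcoef(responses[:, j], rest)[0, 1]    # corrected item-total correlation
            if r_it < r_min or p_value < p_min:
                flags.append((j, round(p_value, 2), round(r_it, 2)))
        return flags

    # Hypothetical data: 3,000 examinees, 40 items, one item unrelated to ability.
    rng = np.random.default_rng(2)
    ability = rng.normal(size=(3000, 1))
    data = (ability + rng.normal(size=(3000, 40)) > 0).astype(int)
    data[:, 5] = rng.integers(0, 2, 3000)   # a "bad" item answered essentially at random
    print(screen_items(data))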

Multi-select items

Multi-select items are in a multiple-choice format, but instead of only one right answer, several of the answers provided below the item stem can be correct and appropriately chosen. While there is not a great deal of evidence in support, some believe that these multi-select items can test content in the standards that is difficult to test with other item formats. A sheet of rules for writing these types of items was distributed at the meeting, and discussed in detail; revisions were suggested by the TAC. Generally, the TAC favored a single set of rules to apply to all three subjects: ELA, math and science. Further, the TAC favored simplifying the rules where possible.

Providing math formula sheets with the assessments

Thus far, Missouri assessments have not provided sheets of mathematical formulas that students can consult while taking the exam. Nevertheless, some would prefer that formula sheets be provided. The TAC asked how the state content standards address this issue and was assured that the standards are silent on the matter. The TAC recommends that the state continue not to provide formula sheets for mathematics assessments. If an item is ever written that is intended to have a formula provided, that formula can be included in the item itself. If that happens, care should be taken that the provided formula does not change what is tested by any of the other items on the test form.

2017/18 ELA and mathematics vertical scaling plan

The TAC was provided in advance of the meeting an email from Joanna Tomkowicz to Lisa Sireno and Shaun Bates describing DRC's plans for vertical scaling of the spring 2018 MAP assessments in ELA and mathematics, which were developed to align to the new Missouri Learning Standards. DRC proposes to use a set of common items between the spring 2017 and spring 2018 operational tests in every grade of ELA and mathematics to serve as anchor items in developing the new vertical scales. In addition, sets of items from the grade above and the grade below will be administered to samples of students taking the Grade-Level operational tests to facilitate between-grade linking. A hybrid approach to building vertical scales, as recommended by the TAC in a previous meeting, will be used. In this approach, concurrent calibration results are treated as an initial scale for the operational assessments. The initial scales are then equated to the existing ELA and mathematics scales using the common anchor items between the two administrations. Concurrent calibration of test data for all grades and content areas will then be conducted to build the vertical scales.
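
DRC's email does not specify the equating method beyond the use of common anchor items. Purely for illustration, the sketch below shows one standard possibility, a mean-sigma linking of anchor-item difficulties that carries a score from the new (concurrently calibrated) scale onto the existing scale; all item values are hypothetical.

    """Illustrative mean-sigma linking of a new (concurrently calibrated) scale to an
    existing scale through common anchor items. This is one standard approach, not
    necessarily DRC's, and the values below are hypothetical."""
    import numpy as np

    def mean_sigma_link(b_anchor_new, b_anchor_old):
        """Return (A, B) such that theta_old is approximately A * theta_new + B."""
        b_new = np.asarray(b_anchor_new, dtype=float)
        b_old = np.asarray(b_anchor_old, dtype=float)
        A = b_old.std(ddof=1) / b_new.std(ddof=1)
        B = b_old.mean() - A * b_new.mean()
        return A, B

    # Hypothetical anchor-item difficulties on the new and the existing scales.
    b_new = np.array([-1.2, -0.4, 0.1, 0.6, 1.3, 1.9])
    b_old = np.array([-1.0, -0.2, 0.3, 0.9, 1.5, 2.2])
    A, B = mean_sigma_link(b_new, b_old)
    theta_new = 0.50                      # a student ability on the new scale
    theta_old = A * theta_new + B         # the same ability expressed on the old scale
    print(round(A, 3), round(B, 3), round(theta_old, 3))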

The metric of the new vertical scales will be established to have means and standard deviations substantially different from the previous scales to signal that these are new assessments for new standards and comparisons of results to earlier years should not be made.

The TAC complimented DRC on their plans for building the new vertical scales. The TAC did recommend that vertical linking items be carefully selected so that they reflect the best available items for spanning alignment of content between the two adjacent grades. The TAC would like to see the results for building the vertical scale using a) items from above and below grades, b) items from only above grades and c) items from only below grades before deciding which approach should be used.

Interim assessments

DESE has asked DRC to develop an interim assessment for grades 3 through 8 in ELA and mathematics and for grades 5 and 8 in science, to be ready for the 2017/18 year. These interim assessments consist of one form per subject and grade and are also being called "pretests." The TAC asked what the goals were for these pretests and was told they would be used as practice tests that mimic the operational tests in format and length. The practice tests could be taken at any time, and as many times as desired, between November and June. The plan is to calibrate the practice test to the operational test using first-time test-taker data only. Results would be reported on the same scale as the operational test, and end-of-year proficiencies would be reported.

The TAC liked the plans, but recommends that the tests be called “practice tests,” not “interim assessments” nor “pretests.”

Vertical articulation between middle and high school assessments

DESE has asked its two contractors (DRC and Questar) to work together to better articulate high school assessments and Grade-Level assessments when new standards are set for each. The two contractors jointly talked to the TAC about their plans, saying that a formal proposal for vertical articulation will be presented to DESE and the TAC at the next meeting. The goals of the vertical articulation are to have a K-12 assessment system where the proficiency impact data is coordinated between the assessment systems for K-8 and high school. Further, where there are jumps in the impact data, the articulation should provide an explanation for why those jumps make sense. Essentially, the idea is that if a student is proficient at one Grade-Level he or she is ready to be proficient at the next higher Grade-Level.

With those goals in mind, the two contractors plan to set high school standards first. They may use ACT benchmarks and impact data to inform the high school standard-setting. Then, Grade-Level proficiency standards would be set within the context of the previously set EOC standards. At one point in the discussion, a rationale for higher percents proficient on EOCs was offered: the focus of a single course and its assessment might cover a narrower domain than the domain for an entire Grade-Level in a subject, and these narrower domains might be more easily taught and mastered. The TAC noted that the contractors use different IRT models and that this might influence the selection of a response probability for standard-setting. The TAC further noted that there can be difficulties for standard-setting when a test is centered on the standards but not well centered on actual student achievement levels. The TAC concluded that while it is acceptable to have different response probability values for grades 3-8 and high school, the same response probabilities should be used across subjects and grades for grades 3-8 and across courses for End-of-Course tests in high school. The TAC was informed by DESE that it is working on a policy piece addressing these issues, and the TAC recommends that that policy piece be included in the proposal from the two contractors addressing vertical articulation of the two assessment systems.
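
As background on the response-probability point, the sketch below shows how the ability level to which a single item maps shifts with the RP criterion under a 2PL model (a closed-form result); the item parameters and RP values are illustrative only.

    """Illustrative effect of the response-probability (RP) criterion on item mapping.
    Under a 2PL model, the ability at which an item is answered correctly with
    probability RP has a closed form; the parameters below are hypothetical."""
    import numpy as np

    def theta_at_rp(a, b, rp):
        """Ability theta where P(correct | theta) = rp for a 2PL item with slope a, difficulty b."""
        return b + np.log(rp / (1.0 - rp)) / a

    # The same item maps to noticeably different ability levels under RP = 0.50 vs RP = 0.67.
    a, b = 1.0, 0.2                        # Rasch-like item (slope fixed at 1)
    print(theta_at_rp(a, b, 0.50))         # 0.20
    print(theta_at_rp(a, b, 0.67))         # about 0.91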

EOC 2016/17 results

DESE Commissioner Margie Vandeven joined the TAC for a discussion of whether the 2016/17 results for Algebra I and English II are valid for use in the state accountability program. The discussion was a continuation of the TAC conference call held in July. The concern is that student performance in these two subjects dropped substantially from the previous year. In its July conference call, the TAC concluded that this was, at least in part, a form effect rather than a true drop in performance. The TAC noted that when the form used in 2017 was last administered two years earlier (in 2015), performance was comparable (although somewhat higher). While it is common to use different forms across years for EOC tests, the forms were not built to ensure that they are psychometrically on the same scale. Importantly, 2017 is the last year for using non-equated forms; in the future, EOC assessment forms will be carefully built so that they are not only reliable and valid but also comparable. At this point in time, results of the 2016/17 assessments have been reported to students, but they have not yet been used for accountability. The interpretation of results of the other EOC assessments is not being called into question.

The TAC was informed that a performance index is built based on student proficiency levels with 1 point for below basic, 3 points for basic, 4 points for proficient and 5 points for advanced. Using this index, three measures are used for accountability: progress, value-added and status. The TAC noted that neither value-added nor progress requires that forms be comparable between 2016 and 2017 since value-added is metric free and progress is measured such that it reflects only change between two years ago and the current year (i.e., 2015 and 2017); for those two years the same form was used. The problem then is status.
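
Using the point values just described, a minimal sketch of how a status value might be computed for a school follows; aggregating the points as a simple per-student average is an assumption made here for illustration, not necessarily how the Missouri index is actually aggregated.

    """Illustrative computation of the performance index described above.
    The point values (1, 3, 4, 5) are from the minutes; aggregating them as an
    average per student is an assumption made here for illustration only."""

    POINTS = {"Below Basic": 1, "Basic": 3, "Proficient": 4, "Advanced": 5}

    def performance_index(levels):
        """Average index points per student for one school or district."""
        return sum(POINTS[level] for level in levels) / len(levels)

    # Hypothetical school with 10 tested students.
    levels = ["Below Basic", "Basic", "Basic", "Proficient", "Proficient",
              "Proficient", "Proficient", "Advanced", "Advanced", "Advanced"]
    print(performance_index(levels))   # 3.8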

Discussion identified six possible actions:

  1. Drop the 2016 results and use only 2015 and 2017 for which the same form was used.
  2. Hold schools and districts harmless, meaning that if a school or district does better in 2017 on the results as reported, it gets credit for the improvement; otherwise it receives the same accountability level as in the prior year.
  3. Cut scores could be adjusted by, say, a single raw score point to make the results between 2016 and 2017 appear more comparable.
  4. Use equipercentile procedures, putting the form used in 2017 on the same scale as the form used in 2016 (a minimal sketch of this approach follows the list).
  5. Accept the results as reported to students and use the results for accountability accordingly.
  6. Exclude the results for Algebra I and English II in accountability and don’t report those results.
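
As a point of reference for option 4, the sketch below shows a bare-bones equipercentile linking that maps each raw score on the 2017 form to the 2016-form raw score with the same percentile rank. The distributions shown are hypothetical, and an operational linking would use the full data and presmoothing.

    """Minimal equipercentile linking sketch for option 4. The raw-score distributions
    below are hypothetical stand-ins for the 2017 and 2016 operational data."""
    import numpy as np

    def percentile_ranks(freqs):
        """Percentile rank (0-100) of each raw score, using the midpoint convention."""
        freqs = np.asarray(freqs, dtype=float)
        cum = np.cumsum(freqs)
        below = cum - freqs
        return 100.0 * (below + 0.5 * freqs) / freqs.sum()

    def equipercentile_map(freqs_new, freqs_old):
        """Map each new-form raw score to the old-form raw score with the same percentile rank."""
        pr_new = percentile_ranks(freqs_new)
        pr_old = percentile_ranks(freqs_old)
        old_scores = np.arange(len(freqs_old))
        return np.interp(pr_new, pr_old, old_scores)

    # Hypothetical frequency distributions over an 11-point raw-score scale.
    form_2017 = [2, 5, 9, 14, 18, 20, 15, 9, 5, 2, 1]
    form_2016 = [1, 4, 8, 13, 17, 19, 17, 11, 6, 3, 1]
    print(np.round(equipercentile_map(form_2017, form_2016), 2))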

Of the six possibilities above, five members of the TAC favored the sixth option, not using the results for accountability and not reporting them. Two members of the TAC favored the first option, dropping the 2016 results and using only 2015 and 2017, for which the same form was used. By the end of the meeting, however, all seven members of the TAC favored the sixth option. The TAC recommends that in the future, every effort be made to analyze the data before results are reported to students so any problems can be detected and corrected.

Small-scale pilot update

Questar is developing new writing prompts and performance tasks. In the previous meeting, the TAC recommended that these be pilot tested and in a later TAC meeting, the TAC reviewed plans for a pilot test. At the time of that review, TAC strongly recommended that the pilot be online as is the operational test. At the current meeting, Questar reported that the pilot will take place on September 3; the pilot will be online and approximately 200 students reflecting a range of achievement levels will be used to pilot each prompt and performance task. There will be a different sample of students for each prompt and task. The TAC complimented Questar on these plans.

Stand-alone field test for science

Questar plans to build four forms each for Biology and Physical Science: two operational forms, one pre-test form, and one breach form for each assessment. Apparently, 95% of the students in the course will take the field test. There was some discussion as to whether students would be as motivated to do well on the field test as they will be on the new operational test, and whether the field test could include items from the existing item pool to estimate the size of any motivation effect. Ultimately, the TAC was supportive of Questar's plans for the field test.

Lexile® and Quantile® measures

Ellie Sanford from MetaMetrics led the TAC in a discussion of how Missouri might report student achievement in reading in Lexiles® and in mathematics in Quantiles®. DESE is interested in exploring these possibilities and asked that the TAC provide their thoughts on the pros and cons of doing this. The motivation for the initiative is to provide students and parents with new ways of thinking about student achievement levels and appropriate future instruction. These reporting metrics are based on the concept of prerequisites. Knowing what prerequisites students have mastered provides suggestions as to where their next instruction should be.

Sanford mentioned that one possibility was to embed MetaMetrics items in Missouri assessments so that the Missouri assessments could be reported on the MetaMetrics scales. Discussion revealed that many districts in Missouri currently use tests other than the state assessments that report results in Lexiles® and Quantiles®. The TAC wondered whether it would be possible to use those data to link the state assessment results to the MetaMetrics scales, rather than adding items to the state assessments and increasing student burden to accomplish that end.

The TAC was uncertain as to DESE’s questions concerning MetaMetrics and its products. DESE will prepare questions for the TAC to address on these issues at a future meeting.

In closing

The next meeting of the TAC is scheduled for December 7 and 8, 2017, and the meeting after that for March 8 and 9, 2018. Yet a third meeting was scheduled for August 2 and 3, 2018.




Missouri End-of-Course Algebra I and English II Form Effect Considerations

Presented by Questar Assessment Inc.
August 2017
5550 Upper 147th Street West, Apple Valley, MN 55124
(952) 997-2700
www.questarai.com

Table of Contents
1. Introduction
   1.1. Purpose of the Document
2. English II Results
3. Algebra I Results
4. Historic Impact Data

List of Tables
Table 1. Original and Adjusted RSS Tables: English II
Table 2. Performance Results for English II
Table 3. Original and Adjusted RSS Tables: Algebra I
Table 4. Performance Results for Algebra I
Table 5. Percent of Students at Each Performance Level: English II
Table 6. Percent of Students at Each Performance Level: Algebra I

1. Introduction

1.1. Purpose of the Document

The Missouri Department of Elementary and Secondary Education (DESE) convened a meeting with the Technical Advisory Committee (TAC) on July 27, 2017 to review the Spring 2017 performance results. Special attention was given to the English II and Algebra I results, both of which showed a decline in the percent of students in the Proficient and Advanced classifications. Specifically, the percent of English II students achieving the Proficient or Advanced level was 71.3 for Spring 2017 compared to 80.8 for Spring 2016 (a decline of 9.5 percentage points), and the percent of Algebra I students achieving the Proficient or Advanced level was 60.4 for Spring 2017 compared to 67.4 for Spring 2016 (a decline of 7.0 percentage points). The TAC concluded that form effects were present for English II and Algebra I. Form H was administered in Spring 2015 and Spring 2017, and Form G was administered in Spring 2016. The Spring 2017 results show a slight decline in students achieving the Proficient or Advanced level compared to the Spring 2015 results (a 3.2 percentage point decline for English II and a 2.4 percentage point decline for Algebra I).

Both content areas were part of the recalibration study conducted last year. Although the recalibration results produced reasonable cut scores for English II, the cut scores for Algebra I were not reasonable, so no action was taken at that time. At the July meeting, the TAC generally favored lowering the Proficient cut scores by one raw score point so students and districts are not disadvantaged by the form effect. Since the July meeting, DESE has asked Questar to show the impact of lowering the cut scores by one point at all performance levels. These results, presented by content area, show the Raw to Scale Score (RSS) tables, the frequency distribution of raw scores, and the impact data. Does the TAC recommend other score adjustment possibilities to consider?
2. English II Results

Table 1 shows the original and adjusted RSS tables and the frequency distribution of raw scores. The performance level cut scores for English II are 182, 200, and 225 for Basic, Proficient, and Advanced, respectively. The scale scores and performance levels that were changed are marked with an asterisk (*). Note that there are two raw scores that map to a scale score of 200 and two raw scores that map to a scale score of 225 on the adjusted RSS table. Table 2 presents the raw score range for each performance level, student counts, and percent of students at each performance level for the original and adjusted RSS tables. The percent of Proficient and Advanced students increased by about four percentage points (from 71.3 to 75.2) when the adjusted cut scores were applied.

Table 1. Original and Adjusted RSS Tables: English II
(SS = scale score, SE = standard error, PL = performance level; adjusted values that changed are marked *)

Raw   Original SS  SE  PL    Adjusted SS  SE  PL    N        Percent
 0        105      28   1        105      28   1        11    0.02
 1        124      16   1        124      16   1         0    0.00
 2        135      11   1        135      11   1         0    0.00
 3        142       9   1        142       9   1         6    0.01
 4        146       8   1        146       8   1         7    0.01
 5        150       7   1        150       7   1        15    0.02
 6        153       7   1        153       7   1        21    0.03
 7        156       6   1        156       6   1        35    0.06
 8        159       6   1        159       6   1        72    0.12
 9        161       6   1        161       6   1        66    0.11
10        163       6   1        163       6   1       114    0.19
11        165       6   1        165       6   1       147    0.24
12        167       6   1        167       6   1       224    0.36
13        169       6   1        169       6   1       273    0.44
14        172       6   1        172       6   1       346    0.56
15        174       6   1        174       6   1       411    0.67
16        176       6   1        176       6   1       499    0.81
17        178       6   1        178       6   1       576    0.94
18        179       6   1        182*      6   2*      753    1.22
19        182       6   2        182       6   2       869    1.41
20        183       5   2        183       5   2     1,014    1.65
21        185       5   2        185       5   2     1,175    1.91
22        187       5   2        187       5   2     1,296    2.10
23        189       5   2        189       5   2     1,508    2.45
24        191       5   2        191       5   2     1,781    2.89
25        193       6   2        193       6   2     1,842    2.99
26        195       6   2        195       6   2     2,187    3.55
27        197       6   2        200*      6   3*    2,426    3.94
28        200       6   3        200       6   3     2,602    4.22
29        201       6   3        201       6   3     2,789    4.53
30        203       6   3        203       6   3     3,175    5.15
31        205       6   3        205       6   3     3,293    5.35
32        208       6   3        208       6   3     3,526    5.72
33        210       6   3        210       6   3     3,698    6.00
34        212       6   3        212       6   3     3,796    6.16
35        215       6   3        215       6   3     3,920    6.36
36        218       7   3        218       7   3     3,737    6.07
37        221       7   3        225*      7   4*    3,612    5.86
38        225       7   4        225       7   4     3,102    5.04
39        227       8   4        227       8   4     2,524    4.10
40        232       8   4        232       8   4     1,792    2.91
41        236       9   4        236       9   4     1,237    2.01
42        242      10   4        242      10   4       657    1.07
43        250      12   4        250      12   4       324    0.53
44        250      16   4        250      16   4       114    0.19
45        250      29   4        250      29   4        22    0.04

Table 2. Performance Results for English II

                          Original RSS Results             Adjusted RSS Results
Performance Level         Raw Score   N-Count   Percent    Raw Score   N-Count   Percent
Below Basic               0–18          3,576       5.8    0–17          2,823       4.6
Basic                     19–27        14,098      22.9    18–26        12,425      20.2
Proficient                28–37        34,148      55.4    27–36        32,962      53.5
Advanced                  38–45         9,772      15.9    37–50        13,384      21.7
Below Basic + Basic                    17,674      28.7                 15,248      24.8
Proficient + Advanced                  43,920      71.3                 46,346      75.2
Total                                  61,594     100.0                 61,594     100.0

3. Algebra I Results

Table 3 shows the original and adjusted RSS tables and the frequency distribution of raw scores. The performance level cut scores for Algebra I are 187, 200, and 225 for Basic, Proficient, and Advanced, respectively. The scale scores and performance levels that were changed are marked with an asterisk (*). Like the English II results, there are two raw scores that map to a scale score of 200 and two raw scores that map to a scale score of 225 on the adjusted RSS table.

Table 4 presents the raw score range for each performance level, student counts, and percent of students at each performance level for the original and adjusted RSS tables. The percent of Proficient and Advanced students increased by about four percentage points (from 60.4 to 64.2) when the adjusted cut scores were applied.

Table 3. Original and Adjusted RSS Tables: Algebra I
(SS = scale score, SE = standard error, PL = performance level; adjusted values that changed are marked *)

Raw   Original SS  SE  PL    Adjusted SS  SE  PL    N        Percent
 0        100      40   1        100      40   1        10    0.02
 1        107      22   1        107      22   1         8    0.01
 2        124      16   1        124      16   1         6    0.01
 3        134      14   1        134      14   1         6    0.01
 4        142      12   1        142      12   1        20    0.03
 5        148      11   1        148      11   1        56    0.09
 6        153      10   1        153      10   1       106    0.17
 7        158      10   1        158      10   1       212    0.35
 8        162       9   1        162       9   1       362    0.60
 9        166       9   1        166       9   1       548    0.90
10        169       8   1        169       8   1       792    1.31
11        172       8   1        172       8   1     1,059    1.75
12        175       8   1        175       8   1     1,337    2.21
13        178       8   1        178       8   1     1,620    2.67
14        181       7   1        181       7   1     1,980    3.27
15        183       7   1        183       7   1     2,031    3.35
16        186       7   1        187*      7   2*    2,239    3.70
17        188       7   2        188       7   2     2,331    3.85
18        190       7   2        190       7   2     2,322    3.83
19        193       7   2        193       7   2     2,298    3.79
20        195       7   2        195       7   2     2,367    3.91
21        197       7   2        200*      7   3*    2,283    3.77
22        200       7   3        200       7   3     2,315    3.82
23        201       7   3        201       7   3     2,185    3.61
24        203       7   3        203       7   3     2,131    3.52
25        205       7   3        205       7   3     2,080    3.43
26        207       6   3        207       6   3     2,022    3.34
27        209       6   3        209       6   3     2,035    3.36
28        211       6   3        211       6   3     1,953    3.22
29        213       6   3        213       6   3     1,880    3.10
30        215       6   3        215       6   3     1,770    2.92
31        217       6   3        217       6   3     1,834    3.03
32        219       6   3        219       6   3     1,791    2.96
33        221       7   3        221       7   3     1,543    2.55
34        223       7   3        225*      7   4*    1,530    2.53
35        225       7   4        225       7   4     1,395    2.30
36        227       7   4        227       7   4     1,349    2.23
37        229       7   4        229       7   4     1,278    2.11
38        231       7   4        231       7   4     1,179    1.95
39        234       7   4        234       7   4     1,109    1.83
40        236       8   4        236       8   4     1,020    1.68
41        239       8   4        239       8   4       930    1.54
42        242       8   4        242       8   4       803    1.33
43        245       9   4        245       9   4       666    1.10
44        249       9   4        249       9   4       563    0.93
45        250      10   4        250      10   4       460    0.76
46        250      11   4        250      11   4       317    0.52
47        250      13   4        250      13   4       223    0.37
48        250      16   4        250      16   4       129    0.21
49        250      22   4        250      22   4        77    0.13
50        250      39   4        250      39   4        22    0.04

Table 4. Performance Results for Algebra I

                          Original RSS Results             Adjusted RSS Results
Performance Level         Raw Score   N-Count   Percent    Raw Score   N-Count   Percent
Below Basic               0–16         12,392      20.5    0–15         10,153      16.8
Basic                     17–21        11,601      19.1    16–20        11,557      19.1
Proficient                22–34        25,069      41.4    21–33        25,822      42.6
Advanced                  35–50        11,520      19.0    34–50        13,050      21.5
Below Basic + Basic                    23,993      39.6                 21,710      35.8
Proficient + Advanced                  36,589      60.4                 38,872      64.2
Total                                  60,582     100.0                 60,582     100.0

4. Historic Impact Data

Tables 5 and 6 show the impact data from Fall 2014 to Spring 2017 for English II and Algebra I, respectively. The last column of each table presents the results of lowering each performance level cut score by one raw score point. The form administered in each term (Form G or Form H) is shown in the column headings.

Table 5. Percent of Students at Each Performance Level: English II

                           2014–2015                2015–2016                2016–2017
                        Fall    Spr     Sum      Fall    Spr     Sum      Fall    Spr     Spr
Performance Level       Form G  Form H  Form G   Form H  Form G  Form G   Form G  Form H  Adj Cuts
Below Basic              14.1     5.0    21.2     22.6     3.2    20.2     16.0     5.8     4.6
Basic                    30.1    20.5    38.6     29.7    16.0    36.4     28.0    22.9    20.2
Proficient               44.5    56.1    37.0     39.7    63.0    40.8     47.4    55.4    53.5
Advanced                 11.3    18.4     3.3      8.0    17.8     2.5      8.6    15.9    21.7
Below Basic + Basic      44.2    25.5    59.8     52.3    19.2    56.7     44.0    28.7    24.8
Proficient + Advanced    55.8    74.5    40.3     47.7    80.8    43.3     56.0    71.3    75.2
Total                   100.0   100.0   100.0    100.0   100.0   100.0    100.0   100.0   100.0

Table 6. Percent of Students at Each Performance Level: Algebra I

                           2014–2015                2015–2016                2016–2017
                        Fall    Spr     Sum      Fall    Spr     Sum      Fall    Spr     Spr
Performance Level       Form G  Form H  Form G   Form H  Form G  Form G   Form G  Form H  Adj Cuts
Below Basic              23.6    18.1    21.4     35.8    13.0    20.2     24      20.5    16.8
Basic                    24.5    19.0    23.5     20.2    19.6    27.1     25      19.1    19.1
Proficient               39.0    43.7    44.5     30.7    48.7    41.7     35      41.4    42.6
Advanced                 12.9    19.1    10.6     13.4    18.7    11.1     16      19.0    21.5
Below Basic + Basic      48.1    37.1    44.9     56.0    32.6    47.2     48.4    39.6    35.8
Proficient + Advanced    51.9    62.8    55.1     44.1    67.4    52.8     51.6    60.4    64.2
Total                   100.0   100.0   100.0    100.0   100.0   100.0    100.0   100.0   100.0

Does the TAC recommend making the raw score adjustment for the Spring 2017 English II and Algebra I tests?
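
The impact figures in Tables 2 and 4 follow mechanically from the raw-score frequency distribution and the cut points. The sketch below reproduces that calculation for the original and the one-point-lower English II cuts; the cut points are taken from Table 2, but the frequency counts are hypothetical stand-ins rather than the operational data.

    """Sketch of the impact calculation behind Tables 2 and 4: percent of students in each
    performance level under the original and the adjusted (one point lower) raw-score cuts.
    The English II cuts (19/28/38 and 18/27/37) are from Table 2; the frequency counts
    below are hypothetical stand-ins for the operational distribution."""
    import numpy as np

    LEVELS = ["Below Basic", "Basic", "Proficient", "Advanced"]

    def impact(freqs, cuts):
        """Percent of students in each level; cuts are the lowest raw scores of Basic,
        Proficient, and Advanced."""
        freqs = np.asarray(freqs, dtype=float)
        edges = [0, *cuts, len(freqs)]
        total = freqs.sum()
        return {LEVELS[i]: round(100 * freqs[edges[i]:edges[i + 1]].sum() / total, 1)
                for i in range(4)}

    rng = np.random.default_rng(3)
    freqs = rng.integers(50, 4000, 46)          # hypothetical counts for raw scores 0-45
    print(impact(freqs, cuts=(19, 28, 38)))     # original English II cuts
    print(impact(freqs, cuts=(18, 27, 37)))     # cuts lowered by one raw-score point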