Message: Additional EOC Materials

Additional EOC Materials
From: Sireno, Lisa
Date: Friday, August 18, 2017 8:42 AM
To: Henningsen, Blaine; Preis, Stacey; Neale, Chris
Cc:
Subject: Additional EOC Materials

New documents regarding EOC for today’s discussions

 

Lisa Sireno | Standards and Assessment Administrator | Office of College and Career Readiness

Missouri Department of Elementary and Secondary Education | 573-751-3545 | dese.mo.gov

 


Missouri End-of-Course Algebra I and English II Form Effect Considerations

Presented by Questar Assessment Inc.
August 2017
5550 Upper 147th Street West, Apple Valley, MN 55124
(952) 997-2700
www.questarai.com

Table of Contents
1. Introduction
   1.1. Purpose of the Document
2. English II Results
3. Algebra I Results
4. Historic Impact Data

List of Tables
Table 1. Original and Adjusted RSS Tables: English II
Table 2. Performance Results for English II
Table 3. Original and Adjusted RSS Tables: Algebra I
Table 4. Performance Results for Algebra I
Table 5. Percent of Students at Each Performance Level: English II
Table 6. Percent of Students at Each Performance Level: Algebra I

1. Introduction

1.1. Purpose of the Document

The Missouri Department of Elementary and Secondary Education (DESE) convened a meeting with the Technical Advisory Committee (TAC) on July 27, 2017 to review the Spring 2017 performance results. Special attention was given to the English II and Algebra I results, both of which showed a decline in the percent of students in the Proficient and Advanced classifications. Specifically, the percent of English II students achieving the Proficient or Advanced level was 71.3 for Spring 2017 compared to 80.8 for Spring 2016 (a decline of 9.5 percentage points), and the percent of Algebra I students achieving the Proficient or Advanced level was 60.4 for Spring 2017 compared to 67.4 for Spring 2016 (a decline of 7.0 percentage points).

The TAC concluded that form effects were present for English II and Algebra I. Form H was administered in Spring 2015 and Spring 2017, and Form G was administered in Spring 2016. The Spring 2017 results show a slight decline in students achieving the Proficient and Advanced levels compared to the Spring 2015 results (a 3.2 percentage-point decline for English II and a 2.4 percentage-point decline for Algebra I).

Both content areas were part of the recalibration study conducted last year. Although the recalibration produced reasonable cut scores for English II, the cut scores for Algebra I were not reasonable, so no action was taken at that time. At the July meeting, the TAC generally favored lowering the Proficient cut scores by one raw score point so that students and districts are not disadvantaged by the form effect. Since the July meeting, DESE has asked Questar to show the impact of lowering the cut score by one point at all performance levels. The results below, presented by content area, show the Raw to Scale Score (RSS) tables, the frequency distribution of raw scores, and the impact data.

Does the TAC recommend other score adjustment possibilities to consider?
2. English II Results

Table 1 shows the original and adjusted RSS tables and the frequency distribution of raw scores. The performance level cut scores for English II are 182, 200, and 225 for Basic, Proficient, and Advanced, respectively. The scale scores and performance levels that were changed are shown in red text in the source document. Note that two raw scores map to a scale score of 200 and two raw scores map to a scale score of 225 on the adjusted RSS table.

Table 2 presents the raw score range for each performance level, the student counts, and the percent of students at each performance level under the original and adjusted RSS tables. The percent of Proficient and Advanced students increased by almost four percentage points (from 71.3% to 75.2%) when the adjusted cut scores were applied.

Table 1. Original and Adjusted RSS Tables: English II
(Perf Level: 1 = Below Basic, 2 = Basic, 3 = Proficient, 4 = Advanced)

            Original RSS Table       Adjusted RSS Table       Freq Distribution
Raw Score   Scale  St.Err  Perf      Scale  St.Err  Perf      N        Percent
 0          105    28      1         105    28      1         11       0.02
 1          124    16      1         124    16      1         0        0.00
 2          135    11      1         135    11      1         0        0.00
 3          142    9       1         142    9       1         6        0.01
 4          146    8       1         146    8       1         7        0.01
 5          150    7       1         150    7       1         15       0.02
 6          153    7       1         153    7       1         21       0.03
 7          156    6       1         156    6       1         35       0.06
 8          159    6       1         159    6       1         72       0.12
 9          161    6       1         161    6       1         66       0.11
10          163    6       1         163    6       1         114      0.19
11          165    6       1         165    6       1         147      0.24
12          167    6       1         167    6       1         224      0.36
13          169    6       1         169    6       1         273      0.44
14          172    6       1         172    6       1         346      0.56
15          174    6       1         174    6       1         411      0.67
16          176    6       1         176    6       1         499      0.81
17          178    6       1         178    6       1         576      0.94
18          179    6       1         182    6       2         753      1.22
19          182    6       2         182    6       2         869      1.41
20          183    5       2         183    5       2         1,014    1.65
21          185    5       2         185    5       2         1,175    1.91
22          187    5       2         187    5       2         1,296    2.10
23          189    5       2         189    5       2         1,508    2.45
24          191    5       2         191    5       2         1,781    2.89
25          193    6       2         193    6       2         1,842    2.99
26          195    6       2         195    6       2         2,187    3.55
27          197    6       2         200    6       3         2,426    3.94
28          200    6       3         200    6       3         2,602    4.22
29          201    6       3         201    6       3         2,789    4.53
30          203    6       3         203    6       3         3,175    5.15
31          205    6       3         205    6       3         3,293    5.35
32          208    6       3         208    6       3         3,526    5.72
33          210    6       3         210    6       3         3,698    6.00
34          212    6       3         212    6       3         3,796    6.16
35          215    6       3         215    6       3         3,920    6.36
36          218    7       3         218    7       3         3,737    6.07
37          221    7       3         225    7       4         3,612    5.86
38          225    7       4         225    7       4         3,102    5.04
39          227    8       4         227    8       4         2,524    4.10
40          232    8       4         232    8       4         1,792    2.91
41          236    9       4         236    9       4         1,237    2.01
42          242    10      4         242    10      4         657      1.07
43          250    12      4         250    12      4         324      0.53
44          250    16      4         250    16      4         114      0.19
45          250    29      4         250    29      4         22       0.04

Table 2. Performance Results for English II

                        Original RSS Results               Adjusted RSS Results
Performance Level       Raw Score   N-Count   Percent      Raw Score   N-Count   Percent
Below Basic             0–18        3,576     5.8          0–17        2,823     4.6
Basic                   19–27       14,098    22.9         18–26       12,425    20.2
Proficient              28–37       34,148    55.4         27–36       32,962    53.5
Advanced                38–45       9,772     15.9         37–45       13,384    21.7
Below Basic + Basic                 17,674    28.7                     15,248    24.8
Proficient + Advanced               43,920    71.3                     46,346    75.2
Total                               61,594    100.0                    61,594    100.0
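The Table 2 impact figures follow mechanically from the Table 1 frequency distribution. Below is a minimal sketch in Python (not part of the original document) that reproduces them; the cut values passed in are the lowest raw scores earning Basic, Proficient, and Advanced.

# Recompute Table 2 from the Table 1 raw-score frequencies (raw scores 0-45).
freq = [11, 0, 0, 6, 7, 15, 21, 35, 72, 66, 114, 147, 224, 273, 346,
        411, 499, 576, 753, 869, 1014, 1175, 1296, 1508, 1781, 1842,
        2187, 2426, 2602, 2789, 3175, 3293, 3526, 3698, 3796, 3920,
        3737, 3612, 3102, 2524, 1792, 1237, 657, 324, 114, 22]

def impact(cuts):
    """Percent of students at each performance level, given the lowest
    raw score earning Basic, Proficient, and Advanced."""
    total = sum(freq)
    bounds = [0] + cuts + [len(freq)]
    counts = [sum(freq[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]
    return [round(100 * c / total, 1) for c in counts]

print(impact([19, 28, 38]))  # original cuts:  [5.8, 22.9, 55.4, 15.9]
print(impact([18, 27, 37]))  # one point lower: [4.6, 20.2, 53.5, 21.7]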
3. Algebra I Results

Table 3 shows the original and adjusted RSS tables and the frequency distribution of raw scores. The performance level cut scores for Algebra I are 187, 200, and 225 for Basic, Proficient, and Advanced, respectively. The scale scores and performance levels that were changed are shown in red text in the source document. As with the English II results, two raw scores map to a scale score of 200 and two raw scores map to a scale score of 225 on the adjusted RSS table.

Table 4 presents the raw score range for each performance level, the student counts, and the percent of students at each performance level under the original and adjusted RSS tables. The percent of Proficient and Advanced students increased by almost four percentage points (from 60.4% to 64.2%) when the adjusted cut scores were applied.

Table 3. Original and Adjusted RSS Tables: Algebra I
(Perf Level: 1 = Below Basic, 2 = Basic, 3 = Proficient, 4 = Advanced)

            Original RSS Table       Adjusted RSS Table       Freq Distribution
Raw Score   Scale  St.Err  Perf      Scale  St.Err  Perf      N        Percent
 0          100    40      1         100    40      1         10       0.02
 1          107    22      1         107    22      1         8        0.01
 2          124    16      1         124    16      1         6        0.01
 3          134    14      1         134    14      1         6        0.01
 4          142    12      1         142    12      1         20       0.03
 5          148    11      1         148    11      1         56       0.09
 6          153    10      1         153    10      1         106      0.17
 7          158    10      1         158    10      1         212      0.35
 8          162    9       1         162    9       1         362      0.60
 9          166    9       1         166    9       1         548      0.90
10          169    8       1         169    8       1         792      1.31
11          172    8       1         172    8       1         1,059    1.75
12          175    8       1         175    8       1         1,337    2.21
13          178    8       1         178    8       1         1,620    2.67
14          181    7       1         181    7       1         1,980    3.27
15          183    7       1         183    7       1         2,031    3.35
16          186    7       1         187    7       2         2,239    3.70
17          188    7       2         188    7       2         2,331    3.85
18          190    7       2         190    7       2         2,322    3.83
19          193    7       2         193    7       2         2,298    3.79
20          195    7       2         195    7       2         2,367    3.91
21          197    7       2         200    7       3         2,283    3.77
22          200    7       3         200    7       3         2,315    3.82
23          201    7       3         201    7       3         2,185    3.61
24          203    7       3         203    7       3         2,131    3.52
25          205    7       3         205    7       3         2,080    3.43
26          207    6       3         207    6       3         2,022    3.34
27          209    6       3         209    6       3         2,035    3.36
28          211    6       3         211    6       3         1,953    3.22
29          213    6       3         213    6       3         1,880    3.10
30          215    6       3         215    6       3         1,770    2.92
31          217    6       3         217    6       3         1,834    3.03
32          219    6       3         219    6       3         1,791    2.96
33          221    7       3         221    7       3         1,543    2.55
34          223    7       3         225    7       4         1,530    2.53
35          225    7       4         225    7       4         1,395    2.30
36          227    7       4         227    7       4         1,349    2.23
37          229    7       4         229    7       4         1,278    2.11
38          231    7       4         231    7       4         1,179    1.95
39          234    7       4         234    7       4         1,109    1.83
40          236    8       4         236    8       4         1,020    1.68
41          239    8       4         239    8       4         930      1.54
42          242    8       4         242    8       4         803      1.33
43          245    9       4         245    9       4         666      1.10
44          249    9       4         249    9       4         563      0.93
45          250    10      4         250    10      4         460      0.76
46          250    11      4         250    11      4         317      0.52
47          250    13      4         250    13      4         223      0.37
48          250    16      4         250    16      4         129      0.21
49          250    22      4         250    22      4         77       0.13
50          250    39      4         250    39      4         22       0.04

Table 4. Performance Results for Algebra I

                        Original RSS Results               Adjusted RSS Results
Performance Level       Raw Score   N-Count   Percent      Raw Score   N-Count   Percent
Below Basic             0–16        12,392    20.5         0–15        10,153    16.8
Basic                   17–21       11,601    19.1         16–20       11,557    19.1
Proficient              22–34       25,069    41.4         21–33       25,822    42.6
Advanced                35–50       11,520    19.0         34–50       13,050    21.5
Below Basic + Basic                 23,993    39.6                     21,710    35.8
Proficient + Advanced               36,589    60.4                     38,872    64.2
Total                               60,582    100.0                    60,582    100.0
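For reference, here is a minimal Python sketch (not from the document) of how an adjusted RSS table behaves as a lookup, using a four-row excerpt of Table 3 around the Algebra I Proficient cut:

# Excerpt of the adjusted Algebra I RSS table:
# raw score -> (scale score, standard error, performance level).
ADJUSTED_RSS = {
    20: (195, 7, 2),
    21: (200, 7, 3),   # was (197, 7, 2) before the one-point adjustment
    22: (200, 7, 3),
    23: (201, 7, 3),
}

def to_scale(raw):
    scale, se, level = ADJUSTED_RSS[raw]
    return scale, se, level

# Raw scores 21 and 22 both map to scale score 200: this is how the
# one-point cut adjustment shows up in the published table.
assert to_scale(21)[0] == to_scale(22)[0] == 200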
4. Historic Impact Data

Tables 5 and 6 show the impact data from Fall 2014 to Spring 2017 for English II and Algebra I, respectively. The last column of each table presents the results of lowering each performance level cut score by one raw score point. In the source document, Spring Form H and Spring Form G are highlighted in different colors to indicate form differences.

Table 5. Percent of Students at Each Performance Level: English II

                        2014–2015                  2015–2016                  2016–2017
                        FALL     SPR      SUM      FALL     SPR      SUM      FALL     SPR      SPR
Performance Level       Form G   Form H   Form G   Form H   Form G   Form G   Form G   Form H   Adj Cuts
Below Basic             14.1     5.0      21.2     22.6     3.2      20.2     16.0     5.8      4.6
Basic                   30.1     20.5     38.6     29.7     16.0     36.4     28.0     22.9     20.2
Proficient              44.5     56.1     37.0     39.7     63.0     40.8     47.4     55.4     53.5
Advanced                11.3     18.4     3.3      8.0      17.8     2.5      8.6      15.9     21.7
Below Basic + Basic     44.2     25.5     59.8     52.3     19.2     56.7     44.0     28.7     24.8
Proficient + Advanced   55.8     74.5     40.3     47.7     80.8     43.3     56.0     71.3     75.2
Total                   100.0    100.0    100.0    100.0    100.0    100.0    100.0    100.0    100.0

Table 6. Percent of Students at Each Performance Level: Algebra I

                        2014–2015                  2015–2016                  2016–2017
                        FALL     SPR      SUM      FALL     SPR      SUM      FALL     SPR      SPR
Performance Level       Form G   Form H   Form G   Form H   Form G   Form G   Form G   Form H   Adj Cuts
Below Basic             23.6     18.1     21.4     35.8     13.0     20.2     24       20.5     16.8
Basic                   24.5     19.0     23.5     20.2     19.6     27.1     25       19.1     19.1
Proficient              39.0     43.7     44.5     30.7     48.7     41.7     35       41.4     42.6
Advanced                12.9     19.1     10.6     13.4     18.7     11.1     16       19.0     21.5
Below Basic + Basic     48.1     37.1     44.9     56.0     32.6     47.2     48.4     39.6     35.8
Proficient + Advanced   51.9     62.8     55.1     44.1     67.4     52.8     51.6     60.4     64.2
Total                   100.0    100.0    100.0    100.0    100.0    100.0    100.0    100.0    100.0

Does the TAC recommend making the raw score adjustment for the Spring 2017 English II and Algebra I tests?
Stand Alone Field Test Plan for the New Missouri End-of-Course (MO EOC) Science Assessments

Presented by Questar Assessment Inc.
August 2017
5550 Upper 147th Street West, Apple Valley, MN 55124
(952) 997-2700
www.questarai.com

Table of Contents
1. Introduction
   1.1. Purpose of the Document
   1.2. Standalone Field Testing
2. Test Designs
3. Number of Forms
   3.1. Considerations
   3.2. Recommendation for 2017–2018 standalone field testing
   3.3. Recommendations for embedded field testing 2018–2019 and beyond
4. Item Development Needs
   4.1. Number of Required Operational Items
   4.2. Overage Considerations
5. Development Targets
6. Sample Size
   6.1. Considerations
   6.2. Recommendation
7. Assigning Forms vs. Spiraling Forms
   7.1. Considerations
   7.2. Recommendation
8. Item Placement Options
   8.1. Standalone field test sequencing
   8.2. Embedded FT Overview
9. Summary
10. References
Appendix A: Proposed Science Blueprints
Appendix B: Development Targets by Strand and Standard

List of Tables
Table 2.a. MO EOC Test Designs Beginning 2018–2019—Biology and Physical Science
Table 2.b. MO EOC SAFT Test Design 2017–2018—Biology and Physical Science
Table 3.a. 2017–2018 Student N-Counts—Biology and Physical Science
Table 3.b. Number of Operational Core and Embedded FT Forms—Biology and Physical Science
Table 4.a. Number of Unique Operational Points Needed for Four Core Forms—Biology and Physical Science
Table 5.a. Number of Science Items Authored by Item Type
Table 5.b. FT Item Yields—Biology and Physical Science

1. Introduction

1.1. Purpose of the Document

This document presents the plan for the Missouri End-of-Course (MO EOC) science standalone field tests (SAFTs) in Biology and Physical Science, and for the subsequent embedded field tests (FTs) to be administered thereafter. The goal of field testing is to maintain an item bank with the quality and quantity of items needed for future operational form construction. The SAFTs are intended to support the construction of two new operational cores, a new pre-test form, and a breach form for use in the Fall 2018 and Spring 2019 school year. The embedded FTs thereafter are intended for item bank maintenance.

Best practice dictates that items be field tested before they are used operationally. Kirkpatrick and Way (2008) suggest that the following be considered when developing a field test plan:
• How many new items should be developed, including overages for attrition?
• How many students should take each item?
• How many FT items should appear on each form?
• How many test forms should include FT items?
• How will item security be addressed (e.g., assigning forms to schools/districts vs. spiraling forms within classrooms)?
• How will data be collected (e.g., standalone vs. embedded field testing)?

1.2. Standalone Field Testing

Because the new EOC science Missouri Learning Standards (MLS) are significantly different from the previous standards, the Missouri Department of Elementary and Secondary Education (DESE) has indicated a preference for standalone field testing of new science content. There is a concern that field test (FT) items embedded within existing forms may be easily identifiable by students.

TAC's feedback is requested on the proposed plan: Is the plan complete? Are there any significant gaps that need to be addressed?

2. Test Designs

In the 2017–2018 administration year, each science assessment included FT placeholder items. The Biology form also included a second session with an operational Performance Event (PE), while the Physical Science form had only one session, consisting solely of standalone items. Table 2.a provides the planned test design for the new MO EOC science operational (OP) assessments that will be administered starting in 2018–2019. Test blueprints for each EOC science course are shown in Appendix A.

Table 2.a. MO EOC Test Designs Beginning 2018–2019—Biology and Physical Science

Assessment          #OP Points   #FT Items*   #Internal Anchor Points
Biology             50           10–12        12–15
Physical Science    50           10–12        12–15
* The number of FT items will be finalized after the SAFT.

These new MO EOC science assessments will include approximately 10–12 FT items. The FT items will include standalone items and scenario sets.
Both the standalone items and the items that are part of a scenario set will include traditional dichotomously scored multiple-choice items as well as technology-enhanced items that may be dichotomously or polytomously scored. Unlike the new MO EOC Math assessments, which include a 10-point Performance Event (PE), the science assessment scenario sets will not have point requirements.

In addition to the FT items, the new science assessments will include linking items that represent approximately 30% of the operational points. These linking items are included as a means of evaluating the success of pre-equating, and as an alternative to pre-equating if the pre-equating results are unacceptable.¹

¹ The 2017–2018 equating plan discusses the use of linking items in more detail.

Table 2.b provides the planned test design for the MO EOC science SAFTs that will be administered in 2017–2018.

Table 2.b. MO EOC SAFT Test Design 2017–2018—Biology and Physical Science

Assessment          #Forms   #Points   #Internal Anchor Points
Biology             5–6      40–50     8–15
Physical Science    5–6      40–50     8–15

Newly developed items will be distributed across the standalone FT forms by strands and standards, so all blueprint strands will be represented on each form. However, each form will not match the exact strand point requirements or ranges.

3. Number of Forms

3.1. Considerations

With both standalone and embedded field testing, the more FT items included in each form, the fewer forms required overall; conversely, fewer FT items per form means more forms are required. There are several considerations when determining the number of FT items per form (a sketch of the resulting form-count arithmetic follows Table 3.b):
• The total testing time available
• The presence of either a single test section or multiple, separately timed sections
• Student motivation and fatigue levels
• Typical item-response latencies
• Public opinion (e.g., parents not wanting their children to lose instructional time when field testing adds to assessment time requirements)

3.2. Recommendation for 2017–2018 standalone field testing

Four to six standalone field test forms are planned for each of the MO EOC science assessments. Table 3.a presents the Fall 2016 and Spring 2017 n-counts.

Table 3.a. 2017–2018 Student N-Counts—Biology and Physical Science

Assessment          Fall 2016   Spring 2017
Biology             2,899       61,957
Physical Science    --          2,940

Based on these n-counts, two Biology standalone field test forms are planned for administration in Fall 2017, with two to four additional forms added for administration in Spring 2018. For Physical Science, all four or five standalone field test forms are planned for administration in Spring 2018 only. The exact number of forms depends on the final development of the new educator-authored items written during an Item Writer Workshop (IWW) held May 30 – June 2, 2017 in Lake Ozark.

3.3. Recommendations for embedded field testing 2018–2019 and beyond

Based on the n-counts shown in Table 3.a, the following is planned for embedded field testing beginning in Fall 2018 and Spring 2019.

Table 3.b. Number of Operational Core and Embedded FT Forms—Biology and Physical Science

                    # Operational Cores         # Embedded FT Forms
Assessment          Fall 2018   Spring 2019     Fall 2018   Spring 2019
Biology             2           2               2           8
Physical Science    1           2               1           4–5
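The form-count tradeoff in Section 3.1 reduces to ceiling division. Below is a minimal sketch (not from the document), with an illustrative 240-point FT development pool, the low end of the totals later shown in Table 5.b:

import math

def forms_needed(total_ft_points, points_per_form):
    # More FT points per form -> fewer standalone forms, and vice versa.
    return math.ceil(total_ft_points / points_per_form)

for per_form in (40, 45, 50):            # Table 2.b allows 40-50 points/form
    print(per_form, forms_needed(240, per_form))
# 40 -> 6 forms, 45 -> 6 forms, 50 -> 5 forms, consistent with the
# 5-6 standalone forms planned per assessment.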
4. Item Development Needs

There are two main considerations when determining item development needs:
1. The minimum number of new operational items needed to construct unique operational forms, as well as one pre-test and one breach form, based on the test blueprint.
2. The anticipated overage rate, so that sufficient numbers of items are field tested (i.e., additional items are needed to account for attrition due to poor item statistics and DIF).

4.1. Number of Required Operational Items

For non-adaptive tests, the minimum number of new operational items is the number of scored items on the test, less the planned number of reused items (e.g., the internal anchor items needed for a common-item equating design). Of course, more items may be developed and field tested. A new testing program may choose to create more items to immediately bolster its item pool. For security purposes, a program may have multiple testing windows, with each window needing its own unique operational test form; the number of FT items needs to increase accordingly in these cases.

Table 4.a provides the approximate number of unique points needed to build four unique operational cores. Because the number of items on any core form will vary according to the point value of each item, point totals are shown in the table instead of item totals. This also aligns with the new test blueprints, which are based on numbers of points. In addition, because the linking sets for the two operational cores will be identified after content is developed, no linking item considerations are included in the needed point counts.

Table 4.a. Number of Unique Operational Points Needed for Four Core Forms—Biology and Physical Science

                                     #OP Points
Assessment          Core 1   Linking Core   Core 2   Pre-Test   Breach   Total
Biology             50       10–15          35–40    50         50       185–190
Physical Science    50       10–15          35–40    50         50       185–190

4.2. Overage Considerations

Some FT items will be rejected because of poor performance. Once the new assessments become operational, it will be good practice to track historical item loss trends and then adjust the item development plan to account for this attrition. Maintaining up-to-date records on the item pool (e.g., number of items by type and strand) facilitates the targeting of specific need areas during field test events. When multiple items are associated with a common stimulus (e.g., items tied to a reading comprehension passage), survival rates for the entire bundle need to be considered. For assessments with detailed content specifications or multiple build specifications (e.g., statistical difficulty, total number of words, or reading level), planning for even greater overage is necessary to ensure that all build requirements can be met.

Table 4.b provides the item development needs assuming overages for standalone items and scenario set items.

Table 4.b. Item Development Needs Based on Overages—Biology and Physical Science

Assessment          #Standalone Points Needed   #Scenario Sets Needed   #Scenario Set Points Needed
Biology             208–268                     16                      64–128
Physical Science    188–214                     16                      64–128
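The overage arithmetic behind these needs is a one-line calculation. The sketch below (illustrative, not from the document) applies the 15% attrition assumption used later in Table 5.b to the Table 4.a point needs:

ATTRITION = 0.15                          # assumed FT item loss rate

def expected_yield(ft_points, attrition=ATTRITION):
    """Points expected to survive field testing after attrition."""
    return ft_points * (1 - attrition)

POINT_NEEDS = (185, 190)                  # Table 4.a totals per assessment
for total in (240, 300):                  # 5-6 forms at 40-50 points each
    print(total, expected_yield(total))   # 240 -> 204.0, 300 -> 255.0
# Even the low end (204 surviving points) covers the 185-190 point needs.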
5. Development Targets

Appendix B shows the item development plans used to determine the item writing assignments given to educators at the Item Writer Workshop (IWW). Development numbers were based on blueprint strand and standard percentage targets, in addition to standalone and scenario set point considerations. The goal of each plan was to provide proportional representation of the blueprint requirements. Table 5.a shows the number of items authored at the IWW to support the 2017–2018 SAFT. The corresponding number of points will be finalized as the content is developed throughout the summer and early fall, and the number and type of items may also vary during the content development process.

Table 5.a. Number of Science Items Authored by Item Type

Item types: Choice, Composite, Drag and Drop, Extended Text, Gap Match, Graphic Gap Match, Hot Text, Inline Choice, Line Match, Match, Text Entry.
Biology: 101, 27, 4, 16, 6, 10, 5, 3, 14 (Total: 186; the source table leaves two item-type cells blank)
Physical Science: 77, 36, 2, 11, 1, 1, 7, 6, 3, 8, 7 (Total: 159)

About 99 Biology items and 68 Physical Science items were taken to a content and bias review meeting during the last week of July 2017. All of the Biology items were reviewed by educators, along with approximately 18 of the Physical Science items. The priority was review of the Biology items to support the administration of two SAFTs in Fall 2017. Application of edits from the review is currently in progress.

Table 5.b provides the expected FT item yields from the science SAFTs, assuming a 15% attrition rate.

Table 5.b. FT Item Yields—Biology and Physical Science

Assessment          #Forms   #FT Points/Form   Total #Points   Expected Yield   Point Needs
Biology             5–6      40–50             240–300         204–255          185–190
Physical Science    5–6      40–50             240–300         204–255          185–190

6. Sample Size

6.1. Considerations

The number of students who should respond to each FT item depends on the intended uses of the data. For example, if the only objective is to eliminate poorly performing items, a few hundred students would suffice. However, test developers may want to evaluate the performance of each multiple-choice item's response options (e.g., constructing smoothed, nonparametric option response curves). Conducting DIF analysis is another common goal, with the sample sizes for both the focal and reference groups needing to be considered. Test developers may also want to bank item statistics such as the IRT parameter estimates.

Professionals have different perspectives on the minimum sample size requirement for different IRT models. For the Rasch and partial-credit models, the minimum n-count in some testing programs has been as low as 100 students for each possible FT item score point (for the FT item with the maximum point value). Various sources can contribute to student attrition (e.g., incomplete test documents), so adjusting expectations to account for an additional 100 students is prudent. Actual n-counts larger than the minimum are always valuable because the standard errors of the IRT parameter estimates will be smaller.

6.2. Recommendation

Questar recommends a minimum of 500 responses per item. To account for student attrition, a minimum of 600 students per form will be targeted.² In most cases, the form n-counts will be well above this threshold.

² 600 is Questar's absolute minimum. Questar prefers n-counts of at least 1,300 students per form, but that would mean that some MO EOC assessments could not do any fall field testing of items. If pre-equating is used in the future, Questar would recommend targeting at least 3,000 students per form to allow for more precise item parameter estimates.
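A minimal sketch of the sample-size arithmetic above (not from the document). The per-score-point floor and the 100-student buffer follow the Section 6.1 rule of thumb; this sketch counts only the non-zero score points, a detail the rule leaves open. The 500/600 figures restate the Section 6.2 recommendation.

def rasch_floor(max_item_points, per_point=100, buffer=100):
    """Minimum n-count for Rasch/partial-credit calibration, driven by
    the highest-valued FT item on the form."""
    return max_item_points * per_point + buffer

def form_target(responses_per_item=500, buffer=100):
    """Per-form recruiting target: responses needed plus an attrition buffer."""
    return responses_per_item + buffer

print(rasch_floor(3))   # 400 students if the largest FT item is worth 3 points
print(form_target())    # 600 students per form, the stated working minimum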
7. Assigning Forms vs. Spiraling Forms

7.1. Considerations

The integrity of test data has received considerable attention over the last few years as cheating scandals at all educational levels have made national headlines. Item exposure is an important and multifaceted security issue. Considerations include the number of forms on which an item appears and the number of students who see an item. Where the items are used (e.g., in which schools and districts) can be controlled if this is deemed a significant security consideration. To limit the exposure of FT items across schools or districts, specific forms can be assigned³ to these units. If a district administers only one test form with a unique set of FT items, no other FT items would be exposed in that district. In contrast, if all forms were spiraled to students within each classroom, all FT items could potentially be exposed. Spiraling has advantages, such as creating randomly equivalent groups. However, whether the benefits of spiraling outweigh the security considerations is an important discussion point.

³ A concern regarding form assignment is that certain student subgroups may not be distributed evenly across forms, especially students with special needs.

7.2. Recommendation

Questar recommends spiraling the field test forms so that all items are administered in all schools and districts. Randomization benefits some of the statistical analyses often conducted on items. However, form assignment can be considered as a way to mitigate item-exposure risks should that ever become a significant security concern.

8. Item Placement Options

8.1. Standalone field test sequencing

Items from each blueprint strand will be randomly placed on the standalone field test forms. Both the Biology and Physical Science SAFTs will consist of a single session that includes both standalone items and items that are part of a scenario set. Because the scenario sets may be more rigorous and time consuming than the standalone items, each form will most likely begin and end with standalone items, to support access to the form and to avoid fatigue at the end of the form.

8.2. Embedded FT Overview

The FT items within an operational test can be placed in various ways, including the following:
• Spread throughout the test (e.g., every fourth test item is an FT item)
• Grouped together within a test or test session
• Appended at the end of the test or the end of a test session

Because the science assessments will include scenario sets, the FT items in embedded FT forms will be grouped together rather than spread throughout the test or appended at the end. On all MO EOC science assessments, the operational and FT item positions will be held constant across all forms, and only the FT items will vary across test forms (i.e., the operational items will be the same on all test forms). A sketch of this assembly pattern follows.
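A minimal sketch of that assembly pattern (item IDs and positions are hypothetical, not from the document): operational items hold fixed positions on every form, and only the grouped FT block differs.

OPERATIONAL = [f"OP{i:02d}" for i in range(1, 41)]    # identical on all forms
FT_BLOCKS = {                                         # varies by form
    "Form 1": ["FT01", "FT02", "FT03", "FT04"],
    "Form 2": ["FT05", "FT06", "FT07", "FT08"],
}

def build_form(name, ft_position=20):
    """Insert the form's FT items as one grouped block at a fixed position,
    rather than spreading them through the test or appending them."""
    return OPERATIONAL[:ft_position] + FT_BLOCKS[name] + OPERATIONAL[ft_position:]

forms = {name: build_form(name) for name in FT_BLOCKS}
# The operational sequence is the same on every form; only the FT block varies.
assert [i for i in forms["Form 1"] if i.startswith("OP")] == \
       [i for i in forms["Form 2"] if i.startswith("OP")]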
9. Summary

The expected item yields, based on 15% attrition from the Fall 2017 and Spring 2018 field tests, will be the basis of an item pool aligned to the new Biology and Physical Science Missouri Learning Standards and will facilitate the construction of two new unique operational forms, one pre-test, and one breach form for each MO EOC science assessment. The actual attrition rate could be lower or higher than 15%; higher attrition rates would mean decreasing the number of forms or possibly increasing the linking set percentage across the two operational cores.

The old item pool contains items written and aligned to the previous Missouri content standards. An alignment study would need to be conducted before any of these items could be used to fill gaps in core, pre-test, or breach forms for the 2018–2019 administration. Moreover, the item statistics for the items in the old item pool should be updated before constructing raw-to-scale score (RSS) tables via test-score pre-equating. There may also be concerns about exposure with existing items, as the original field testing for some of these items occurred as early as Spring 2008.

10. References

Kirkpatrick, R., & Way, W. D. (2008). Field testing and equating designs for state educational assessments. Paper presented at the annual meeting of the American Educational Research Association, New York.

Appendix A: Proposed Science Blueprints

Appendix A.1. Biology Blueprint

From Molecules to Organisms: Structure and Process [11–15 points, 22–30% of test]
  Structure and Function (9-12.LS1.A.1-A.3; 3 standards): 5–8 points (10–16%)
  Growth and Development of Organisms (9-12.LS1.B.1; 1 standard): 2–5 points (4–10%)
  Organization for Matter and Energy Flow in Organisms (9-12.LS1.C.1-C.3; 3 standards): 2–5 points (4–10%)
Ecosystems: Interactions, Energy, and Dynamics [8–12 points, 16–24% of test]
  Interdependent Relationships in Ecosystems (9-12.LS2.A.1; 1 standard): 4–8 points (8–16%)
  Cycles of Matter and Energy Transfer in Ecosystems (9-12.LS2.B.1-B.3; 3 standards): 2–6 points (4–12%)
  Ecosystem Dynamics, Functioning and Resilience (9-12.LS2.C.1-C.2; 2 standards): 1–4 points (2–8%)
Heredity: Inheritance and Variation of Traits [11–15 points, 22–30% of test]
  Inheritance of Traits (9-12.LS3.A.1; 1 standard): 3–6 points (6–12%)
  Variation of Traits (9-12.LS3.B.1-B.4; 4 standards): 6–9 points (12–18%)
Biological Evolution: Unity and Diversity [11–15 points, 22–30% of test]
  Evidence of Common Ancestry and Diversity (9-12.LS4.A.1-A.2; 2 standards): 2–5 points (4–10%)
  Natural Selection (9-12.LS4.B.1-B.2; 2 standards): 4–8 points (8–16%)
  Adaptation (9-12.LS4.C.1-C.3; 3 standards): 4–8 points (8–16%)
Earth and Human Activity [3–6 points, 6–12% of test]
  Biogeology (9-12.ESS2.E.1; 1 standard): 0–3 points (0–6%)
  Natural Resources (9-12.ESS3.A.1; 1 standard): 0–3 points (0–6%)
  Human Impacts on Earth's Systems (9-12.ESS3.C.1-C.2; 2 standards): 0–3 points (0–6%)
  Global Climate Change (9-12.ESS3.D.2; 1 standard): 0–3 points (0–6%)
Total: 30 standards, 50 points (100%)

Appendix A.2. Physical Science Blueprint

Matter and Its Interactions [12–16 points, 24–32% of test]
  Structure and Properties of Matter (9-12.PS1.A.1-A.3, A.5; 4 standards): 6–9 points (12–18%)
  Chemical Reactions (9-12.PS1.B.1-B.3; 3 standards): 2–6 points (4–12%)
  Nuclear Process (9-12.PS1.C.1; 1 standard): 1–3 points (2–6%)
Motion and Stability: Forces and Interactions [12–16 points, 24–32% of test]
  Forces and Motion (9-12.PS2.A.1-A.3; 3 standards): 6–9 points (12–18%)
  Types of Interactions (9-12.PS2.B.1-B.2; 2 standards): 4–7 points (8–14%)
Energy [12–16 points, 24–32% of test]
  Definitions of Energy (9-12.PS3.A.1-A.2; 2 standards): 1–4 points (2–8%)
  Conservation of Energy and Energy Transfer (9-12.PS3.B.1; 1 standard): 4–7 points (8–14%)
  Relationships Between Energy and Forces (9-12.PS3.C.1; 1 standard): 4–7 points (8–14%)
  Wave Properties (9-12.PS4.A.1-A.2; 2 standards): 1–4 points (2–8%)
Earth and the Universe [6–9 points, 12–18% of test]
  The Universe and Its Stars (9-12.ESS1.A.1-A.2; 2 standards): 0–3 points (0–6%)
  Earth and the Solar System (9-12.ESS1.B.1; 1 standard): 0–3 points (0–6%)
  Earth Materials and Systems (9-12.ESS2.A.3-A.4; 2 standards): 0–3 points (0–6%)
Total: 24 standards, 50 points (100%)

Appendix B: Development Targets by Strand and Standard
Table B.1. Biology New Development Targets (Number of Points)

From Molecules to Organisms: Structure and Process [4 scenario sets, 16–32 set points]
  Structure and Function (9-12.LS1.A.1-A.3): 24–28 standalone points
  Growth and Development of Organisms (9-12.LS1.B): 12–16 standalone points
  Organization for Matter and Energy Flow in Organisms (9-12.LS1.C.1-C.2): 12–16 standalone points
Ecosystems: Interactions, Energy, and Dynamics [4 scenario sets, 16–32 set points]
  Interdependent Relationships in Ecosystems (9-12.LS2.A.1): 22–26 standalone points
  Cycles of Matter and Energy Transfer in Ecosystems (9-12.LS2.B.1-B.3): 14–18 standalone points
  Ecosystem Dynamics, Functioning and Resilience (9-12.LS2.C.1-C.2): 8–12 standalone points
Heredity: Inheritance and Variation of Traits [4 scenario sets, 16–32 set points]
  Inheritance of Traits (9-12.LS3.A.1): 16–20 standalone points
  Variation of Traits (9-12.LS3.B.1-B.4): 28–32 standalone points
Biological Evolution: Unity and Diversity [4 scenario sets, 16–32 set points]
  Evidence of Common Ancestry and Diversity (9-12.LS4.A.1-A.2): 12–16 standalone points
  Natural Selection (9-12.LS4.B.1-B.2): 22–26 standalone points
  Adaptation (9-12.LS4.C.1-C.3): 22–26 standalone points
Earth and Human Activity [0 scenario sets, 0 set points]
  Biogeology (9-12.ESS2.E): 4–8 standalone points
  Natural Resources (9-12.ESS3.A.1): 4–8 standalone points
  Human Impacts on Earth's Systems (9-12.ESS3.C.1-C.2): 4–8 standalone points
  Global Climate Change (9-12.ESS3.D.2): 4–8 standalone points
Total: 16 scenario sets, 64–128 set points, 208–268 standalone points

Notes: Scenario sets: ideally 4 scenario sets per form, ranging from 4 to 8 points each; 1 for each Life Science reporting category; composing 32% of each test form. Standalone items: range from 1 to 3 points each, but most are worth 1 or 2 points; single-point, multi-point, or Part A/Part B; objective or TE; composing 68% of each test form.

Table B.2. Physical Science New Development Targets (Number of Points)

Matter and Its Interactions [6 scenario sets, 24–48 set points]
  Structure and Properties of Matter (9-12.PS1.A.1-A.3, A.5): 30–32 standalone points
  Chemical Reactions (9-12.PS1.B.1-B.3): 18–20 standalone points
  Nuclear Process (9-12.PS1.C.1): 8–10 standalone points
Motion and Stability: Forces and Interactions [5 scenario sets, 20–40 set points]
  Forces and Motion (9-12.PS2.A.1-A.3): 30–32 standalone points
  Types of Interactions (9-12.PS2.B.1-B.2): 22–24 standalone points
Energy [5 scenario sets, 20–40 set points]
  Definitions of Energy (9-12.PS3.A.1-A.2): 10–12 standalone points
  Conservation of Energy and Energy Transfer (9-12.PS3.B.1): 22–24 standalone points
  Relationships Between Energy and Forces (9-12.PS3.C.1): 22–24 standalone points
  Wave Properties (9-12.PS4.A.1-A.2): 8–12 standalone points
Earth and the Universe [0 scenario sets, 0 set points]
  The Universe and Its Stars (9-12.ESS1.A.1-A.2): 6–8 standalone points
  Earth and the Solar System (9-12.ESS1.B.1): 6–8 standalone points
  Earth Materials and Systems (9-12.ESS2.A.3-A.4): 6–8 standalone points
Total: 16 scenario sets, 64–128 set points, 188–214 standalone points

Notes: Scenario sets: ideally 4 scenario sets per form, ranging from 4 to 8 points each; 1 for each reporting category; composing 32% of each test form. Standalone items: range from 1 to 3 points each, but most are worth 1 or 2 points; single-point, multi-point, or Part A/Part B; objective or TE; composing 68% of each test form.
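The Notes above imply a simple composition check for any draft 50-point form: scenario sets should carry about 32% of the points and standalone items about 68%. A minimal sketch (not from the document) with illustrative point values:

TOTAL_POINTS = 50

def composition(set_points, standalone_points):
    total = sum(set_points) + sum(standalone_points)
    return total, sum(set_points) / total   # total points, scenario-set share

# Four 4-point scenario sets (16 points) plus 34 standalone points:
total, share = composition([4, 4, 4, 4], [2] * 10 + [1] * 14)
print(total, share)   # 50 0.32  -> the 32% / 68% split described in the Notes
assert total == TOTAL_POINTS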
2017–2018 Pre-Test Plan for the New Missouri End-of-Course (MO EOC) Assessments

Presented by Questar Assessment Inc.
August 2017
5550 Upper 147th Street West, Apple Valley, MN 55124
(952) 997-2700
www.questarai.com

Table of Contents
1. Introduction
   1.1. Purpose of the Document
2. Pre-Tests by Course
   2.1. Pre-Test Overview
   2.2. Pre-Test Design
   2.3. Pre-Test Scoring

List of Tables
Table 2.a. Pre-Tests by Course for 2017–2018 Administration
Table 2.b. Scoring Methods for EOC Pre-Tests—All Courses

1. Introduction

1.1. Purpose of the Document

This document presents the pre-test plan for the Missouri End-of-Course (EOC) assessments in English I, English II, Algebra I, Algebra II, Geometry, Biology, Physical Science, American History, and Government for 2017–2018. TAC's feedback is requested on the pre-test plan: Is it complete? Are there any significant gaps?

2. Pre-Tests by Course

2.1. Pre-Test Overview

Table 2.a provides a brief overview of the pre-tests for each course for the 2017–2018 administration.

Table 2.a. Pre-Tests by Course for 2017–2018 Administration

English I (New, 1 form): Pre-test constructed from newly authored content aligned to the new EOC Missouri Learning Standards and the new Assessment Blueprint
English II (New, 1 form): Pre-test constructed from newly authored content aligned to the new EOC Missouri Learning Standards and the new Assessment Blueprint
Algebra I (New, 1 form): Pre-test constructed from newly authored content aligned to the new EOC Missouri Learning Standards and the new Assessment Blueprint
Algebra II (New, 1 form): Pre-test constructed from newly authored content aligned to the new EOC Missouri Learning Standards and the new Assessment Blueprint
Geometry (New, 1 form): Pre-test constructed from newly authored content aligned to the new EOC Missouri Learning Standards and the new Assessment Blueprint
Biology (None, 0 forms): Standalone field testing only for 2017–2018
Physical Science (None, 0 forms): Standalone field testing only for 2017–2018
American History (Re-Use, 1 form): Re-use of the previous pre-test aligned to the previous EOC standards and blueprint, but delivered on the new test delivery system
Government (Re-Use, 1 form): Re-use of the previous pre-test aligned to the previous EOC standards and blueprint, but delivered on the new test delivery system

2.2. Pre-Test Design

Each new pre-test will be constructed per the course assessment blueprint specifications for strands, standards, and numbers of points. Content for the new pre-tests will come from items authored by Missouri educators during an Item Writer Workshop that took place May 30 to June 2, 2017 in Lake Ozark, MO. The developed items were taken to a content and bias review meeting and reviewed by Missouri educators during the last week of July 2017. The application of edits to these items is currently in progress.
A separate writing prompt will need to be developed for the pre-test, since all prompts need to be associated with a reading passage within the form. There will not be time for educators to review this prompt before it appears on the pre-test form.

2.3. Pre-Test Scoring

Teachers will see scored responses for machine-scored items on the pre-tests. Since the American History and Government assessments consist solely of multiple-choice items, those tests will be fully machine-scored. The EOC English I, English II, Algebra I, Algebra II, and Geometry assessments will include a combination of machine-scored, hand-scored, and hybrid-scored item types. Scoring guides will be provided to teachers so that they can score any hand-scored items. Table 2.b describes the scoring method for each assessment.

Table 2.b. Scoring Methods for EOC Pre-Tests—All Courses

English I (2 sessions): Session 1: machine-scored multiple-choice and technology-enhanced items. Session 2: hand-scored writing prompt.
English II (2 sessions): Session 1: machine-scored multiple-choice and technology-enhanced items. Session 2: hand-scored writing prompt.
Algebra I (2 sessions): Session 1: combination of machine-scored, hybrid, and hand-scored items. Session 2: hand-scored Performance Event, which may include some machine-scored items.
Algebra II (2 sessions): Session 1: combination of machine-scored, hybrid, and hand-scored items. Session 2: hand-scored Performance Event, which may include some machine-scored items.
Geometry (2 sessions): Session 1: combination of machine-scored, hybrid, and hand-scored items. Session 2: hand-scored Performance Event, which may include some machine-scored items.
Biology (n/a): No pre-test will be administered for 2017–2018; only standalone field testing will take place.
Physical Science (n/a): No pre-test will be administered for 2017–2018; only standalone field testing will take place.
American History (1 session): Machine-scored multiple-choice items only.
Government (1 session): Machine-scored multiple-choice items only.
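The document does not spell out how the machine- and hand-scored pieces combine into a pre-test result; below is a minimal sketch (an assumption, not the plan's stated method) in which the two simply sum, with illustrative values:

def pretest_total(machine_scores, teacher_scores):
    """machine_scores: points awarded automatically by the delivery system.
    teacher_scores: points a teacher assigned using the scoring guides."""
    return sum(machine_scores) + sum(teacher_scores)

# e.g., an English II pre-test: Session 1 machine-scored items plus a
# 10-point hand-scored writing prompt that a teacher scored as 7.
print(pretest_total([1, 0, 2, 1, 1], [7]))   # 12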
Scoring Guidelines for Multi-Point Items on the New Missouri End-of-Course (MO EOC) Assessments

Presented by Questar Assessment Inc.
August 2017
5550 Upper 147th Street West, Apple Valley, MN 55124
(952) 997-2700
www.questarai.com

Table of Contents
1. Introduction
   1.1. Purpose of the Document
2. Description of Item Types
   2.1. Choice Items
   2.2. Machine-Scored Technology-Enhanced Items (TEIs)
   2.3. Hand-Scored or Hybrid-Scored TEIs
   2.4. Composite Items, Writing Prompts, and Stimuli Sets
3. List of Item Types
4. Item Type Samples
5. Scoring Guidelines for Multi-Point Items
Appendix A: Proposed Scoring Guidelines

List of Tables
Table 3.a. Item Types on the EOC Assessments
Table 4.a. Sample Item Types for the EOC Assessments

1. Introduction

1.1. Purpose of the Document

This document presents an overview of the different item types that will appear on the new Missouri End-of-Course (EOC) Assessments in English I, English II, Algebra I, Algebra II, Geometry, Biology, Physical Science, American History, and Government. Some of the item types described were introduced as field test items on the EOC English I, English II, Algebra I, Algebra II, and Geometry Assessments in Spring 2017. All item types described will be available for administration in Fall 2017 and Spring 2018 on the ELA and Math assessments listed previously, as well as on the Biology and Physical Science 2017–2018 standalone field tests. Additional item types will appear on future EOC ELA, Math, and Science assessments as they are developed and become available for authoring. These item types will also be used on the upcoming new EOC American History and Government assessments to be field tested in 2018–2019.

In discussion with the Department of Elementary and Secondary Education (DESE), it was determined that students should be able to earn partial credit for technology-enhanced items (TEIs), as opposed to dichotomous scoring of items. TAC's feedback is requested on the scoring guidelines and on relevant research related to partial-credit scoring.

2. Description of Item Types

2.1. Choice Items

Choice items include traditional multiple-choice (MC) item types as well as multiple-select (MS) item types. Choice items include a stem and viable response options, and may or may not be accompanied by stimuli (such as art, a table, a chart, a graph, or another visual representation). MC items will have four answer options, only one correct answer, and a point value of 0 or 1.
MS items may have five to six answer options, two or more correct answer options, and values of 0, 1, or 2 points, generally. Three-point MS items will be evaluated on a case-by-case basis. Since students can earn partial credit, a scoring rubric must be created for each MS item.

2.2. Machine-Scored Technology-Enhanced Items (TEIs)

Machine-scored technology-enhanced items have innovative computer-based interactions that require students to construct their own answers. Like Choice items, TEIs include a stem and response options and may or may not be accompanied by stimuli; however, the number and format of response options vary by item type. In general, TEIs will range in point value from 0 to 2 points, with three-point items evaluated on a case-by-case basis. Scoring rubrics will be created for any multi-point items. The following machine-scored TEIs may appear on the Fall 2017 and Spring 2018 EOC Assessments: drag and drop, gap match, graphic gap match, hot text, inline choice, line match, and match.

2.3. Hand-Scored or Hybrid-Scored TEIs

Hand-scored or hybrid-scored TEIs involve responses that students key in or construct on a grid. Standalone items include text entry, line graphing, and free draw grid interactions. Extended text items can appear as standalone items but are usually associated with a stimulus. Text entry and graphing items will generally range in value from 0 to 2 points, with three-point items evaluated on a case-by-case basis. However, the point values for extended text items may range from 0 to 10 points, depending on how the item is used. All multi-point text entry and graphing items will have a scoring rubric similar to the ones used for Choice items and machine-scored TEIs, while the rubric for extended text items may involve more complex, multi-trait scoring information.

2.4. Composite Items, Writing Prompts, and Stimuli Sets

Composite items, writing prompts, and stimuli sets are made up of the previously described item types, formatted or combined to create a special item type or item set.
• A composite item has multiple interactions, has more than one item part, or is part of a performance event or scenario set.
• A passage set is a set of reading items associated with one or more informational or literary passages. Passage sets appear on the EOC English I and English II assessments.
• A performance event is a set of items associated with a stimulus. Performance events are worth 10 points and appear only on the EOC Algebra I, Algebra II, and Geometry assessments.
• Scenario sets are item sets associated with one or more stimuli. They have no point value requirements and will appear on the EOC Biology and Physical Science assessments.
• Writing prompts are made up of an extended text interaction associated with two or more passages. Prompts are worth 10 points and appear on the EOC English I and English II assessments.

Passage sets, performance events, scenario sets, and writing prompts all appear split screen online: the stimulus or passage appears on the left side of the screen, while the item appears on the right. Students advance through items in the set as they would with standalone items. All standalone items appear full screen online.

3. List of Item Types

Table 3.a lists each item type with a brief description of its functionality and scoring method.
Table 3.a. Item Types on the EOC Assessments

Interaction         Description                                                             Scoring Method
Choice              MC or MS item—selected response                                         Machine scored
Drag and Drop       Drag text to columns                                                    Machine scored
Extended Text       Open-ended response                                                     Hand scored
Free Draw Grid      Graph or draw linear, quadratic, exponential, or logarithmic functions  Hand scored
Gap Match           Drag text into a sentence or phrase                                     Machine scored
Graphic Gap Match   Drag text or a graphic anywhere on the screen                           Machine scored
Hot Text            Select text: word, phrase, sentence                                     Machine scored
Inline Choice       Select drop-down options                                                Machine scored
Line Match          Use lines to match text in columns or rows                              Machine scored
Line Graphing       Graph lines, line segments, rays on grid                                Hand scored*
Match               Check cells in a table—multi-select table                               Machine scored
Text Entry          Enter numeric answer or short answer response                           Hybrid
* The Line Graphing interaction will be machine-scored in the future.

4. Item Type Samples

Table 4.a in the source document shows a screenshot sample of each item type: Choice; Drag and Drop; Extended Text (with and without an Equation Editor palette); Free Draw Grid; Gap Match; Graphic Gap Match; Line Graph; Hot Text; Inline Choice; Line Match; Match; and Text Entry (with and without an Equation Editor palette). The sample images are not reproduced here.

5. Scoring Guidelines for Multi-Point Items

Appendix A shows the proposed scoring guidelines for each item type. Scoring rules may vary across the different EOC content areas. English I and English II have specific rules that allow students to earn partial credit on machine-scored TEIs when incorrect keys are included in a student response. Based on the field test data results, ...

Appendix A: Proposed Scoring Guidelines for Multi-Point Items on the EOC Assessments

Choice MC Items
  Point range: 1 point
  Answer options & key ranges: 4 options, 1 key
  Points scenario: key (1, 2, 3, or 4) = 1 point
  Sample direction lines: n/a
  Response limitations: 1 selection only

Choice MS Items
  Point range: 1–3 points
  Answer options & key ranges: max 5–7 answer options, max 2–4 keys
  Sample point scenario for 5 answer options with 3 keys (e.g., 1, 3, 5):
    2 points: 1,3,5
    1 point: 1,3; 1,5; 3,5
    0 points: any other key combination
  ELA rule, sample point scenario for 6 answer options with 3 keys (e.g., 1, 3, 5):
    2 points: 1,3,5 (three keys ONLY)
    1 point: 1,3; 1,5; 3,5 (two keys), OR 1,2,3; 1,2,3,4; 1,2,3,4,6 ... (two keys and one, two, and/or three incorrect answers), OR 1,2,3,5; 1,3,4,5; 1,3,5,6 (three keys plus one incorrect answer)
    0 points: 1; 2; 3; 4; 5; 6 (0–1 keys), OR 1,2; 1,2,4; 1,2,4,6 ... (0–1 keys and other combinations), OR 1,2,3,4,5,6 (all answer choices selected)
  ELA rule, sample point scenario for x answer options with 2 keys (e.g., 1, 3):
    2 points: 1,3 (two keys ONLY)
    1 point: 1; 3; 1,2; 1,2,4; 1,2,4,6 ... (one key, alone or with incorrect answers, except all answer choices selected)
    0 points: 2; 4; 5; 6; 2,4; 4,6; 2,4,6 ... (no keys), OR 1,2,3,4,5,6 (all available choices selected)
  Sample direction lines: Select all that apply.
  Response limitations: no limitation on the number of responses students can select
Composite Items (Multi-part) or Performance Event
  Point range: 1–3 points; exception: Performance Events (10 points)
  Part & key ranges: no more than 2 parts in an item; key ranges as outlined for each interaction type
  Dependent scoring sample points scenario for a composite item:
    2 points: Part A correct, Part B correct
    1 point: Part A correct, Part B incorrect
    0 points: Part B correct only, or Parts A and B both incorrect
  Performance Event scoring: item-level scoring for standalone and dependent items; group-level scoring
  Sample direction lines: "The following question has two parts. First, answer Part A. Then, answer Part B."
  Response limitations: as outlined for each item interaction type

Extended Text
  Point range: 2–4 points; exception: Writing Prompt (10 points)
  Answer requirements: sample answer (exemplar) and rubric
  Same guidelines for possible point scenarios as outlined for Choice MC or MS items; Writing Prompt: use scoring rubric
  Sample direction lines: n/a
  Response limitations: EE (equation editor) palette with NO keyboard lockdown for Math assessments; no palette for ELA assessments

Free Draw Grid
  Point range: 2–4 points
  Graph and key ranges: max 20 x 20 grid; max 4 functions, rays, or segments; max 10 points; max 1 inequality
  Answer requirements: solution/sample answer (exemplar) and rubric that includes point considerations for title, axis, and scale labels
  Same guidelines for possible point scenarios as outlined for Choice MC or MS items
  Sample direction lines: n/a
  Response limitations: as described in the graph and key ranges

Drag and Drop
  Point range: 1–3 points
  Premise/response & key ranges: max 6 draggers/drop bays; max 3–4 keys
  Same guidelines for possible point scenarios as outlined for Choice MC or MS items
  Sample direction lines: n/a
  Response limitations: up to 6 drop bays generally

Gap Match
  Point range: 1–3 points
  Premise/response & key ranges: max 6 draggers/drop bays; max 3–4 keys
  Same guidelines for possible point scenarios as outlined for Choice MC or MS items
  Sample direction lines: n/a
  Response limitations: up to 6 drop bays generally

Graphic Gap Match
  Point range: 1–3 points
  Premise/response & key ranges: max 6 draggers/drop bays; max 3–4 keys
  Same guidelines for possible point scenarios as outlined for Choice MC or MS items
  Sample direction lines: n/a
  Response limitations: up to 6 drop bays generally

Hot Text
  Point range: 1–3 points
  Selectable text & key ranges: max 6–7 selectable text options; max 3–4 keys
  Same guidelines for possible point scenarios as outlined for Choice MC or MS items
  Sample direction lines: "Select the ..."
  Response limitations: no limit on the number of options students can select

Inline Choice (drop-down)
  Point range: 1–3 points
  Drop-down & answer option ranges: max 4 drop-downs; max 3–4 answer options per drop-down
  Same guidelines for possible point scenarios as outlined for Choice MC or MS items
Extended Text
• Point Range: 2–4 points; exception: Writing Prompt (10 points)
• Answer Requirements: sample answer (exemplar) and rubric
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items. Writing Prompt: use the scoring rubric.
• Sample direction lines: n/a
• Response limitations: Equation Editor (EE) palette with no keyboard lockdown for Mathematics assessments; no palette for ELA assessments

Free Draw Grid
• Point Range: 2–4 points
• Graph and Key Ranges: max 20 × 20 grid; max 4 functions, rays, or segments; max 10 points; max 1 inequality
• Answer Requirements: solution/sample answer (exemplar) and a rubric that includes point considerations for title, axis, and scale labels
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: n/a
• Response limitations: as described in the graph and key ranges

Drag and Drop
• Point Range: 1–3 points
• Premise/Response & Key Ranges: max 6 draggers/drop bays; max 3–4 keys
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: n/a
• Response limitations: up to 6 drop bays generally

Gap Match
• Point Range: 1–3 points
• Premise/Response & Key Ranges: max 6 draggers/drop bays; max 3–4 keys
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: n/a
• Response limitations: up to 6 drop bays generally

Graphic Gap Match
• Point Range: 1–3 points
• Premise/Response & Key Ranges: max 6 draggers/drop bays; max 3–4 keys
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: n/a
• Response limitations: up to 6 drop bays generally

Hot Text
• Point Range: 1–3 points
• Selectable Text & Key Ranges: max 6–7 selectable text options; max 3–4 keys
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: “Select the …”
• Response limitations: no limit on the number of options students can select

Inline Choice (drop-down)
• Point Range: 1–3 points
• Drop-Down & Answer Option Ranges: max 4 drop-downs; max 3–4 answer options per drop-down
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: “Select the …”; “Select the … from the drop-down lists that …”; “Select the words that correctly complete the sentences.”; “Complete the following statement about … by selecting the correct words.”
• Response limitations: n/a

Line Graphing
• Point Range: 1–3 points
• Graph and Key Ranges: max 20 × 20 grid; max 4 lines, rays, or segments; max 10 points; max 1 inequality
• Answer Requirements: solution/sample answer (exemplar) and rubric
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: n/a
• Response limitations: as described in the graph and key ranges

Line Match
• Point Range: 1–3 points
• Premise/Response & Key Ranges: max 6 premises/responses; max 3–4 keys
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: n/a
• Response limitations: one-to-one or one-to-many as appropriate

Match (MS Table)
• Point Range: 1–3 points
• Table & Key Ranges: 6 × 6 table with max 6 keys
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: Select all that apply.
• Response limitations: no limitation on the number of responses students can select, unless it makes sense to limit by column or row

Text Entry
• Point Range: 1–3 points
• Text Box and Key Ranges: max 4 boxes; numeric responses or one-word/short answers
• Same guidelines for point-possible scenarios as outlined for Choice MC or MS items.
• Sample direction lines: n/a
• Response limitations: EE palette and/or keyboard lockdown
Small Scale Pilot Plan for the Missouri End-of-Course (MO EOC) English and Mathematics Assessments

Presented by Questar Assessment Inc.
August 2017
5550 Upper 147th Street West, Apple Valley, MN 55124
(952) 997-2700
www.questarai.com

Table of Contents
1. Introduction .......................................................... 1
   1.1. Purpose of the Document .......................................... 1
2. Pilot Testing ......................................................... 1
   2.1. District Selection Options ....................................... 1
3. Additional Considerations ............................................. 3

List of Tables
Table 2.1. Sample Requirements by Content Area ........................... 1
Table 3.1. Sample Requirements by District Performance ................... 2

1. Introduction

1.1. Purpose of the Document

This document presents the small-scale pilot plan for the Fall 2017 Missouri End-of-Course (EOC) pilot tests for English I, English II, Algebra I, Algebra II, and Geometry. Pilot testing is necessary to test new writing prompts (WPs) for English I and English II and new performance events (PEs) for Algebra I, Algebra II, and Geometry. The new PEs and WPs, along with the field test items embedded in the Spring 2017 forms, will support the construction of new operational forms that will be introduced in Fall 2017 and Spring 2018.

During the April 2017 planning meeting, DESE requested that Questar Assessment Inc. (Questar) identify school districts to target for the pilot testing. The initial plan was to conduct the full pilot study for English I and English II and the pre-pilot study for Algebra I, Algebra II, and Geometry in May 2017, followed by the full pilot study for the Mathematics assessments in September 2017. However, the English pilot and Mathematics pre-pilot tests did not take place in May, so the pilot study for all content areas is scheduled for the week of September 11, 2017.

2. Pilot Testing

Table 2.1 summarizes the pilot testing sample requirements. As shown in the table, five WPs each for English I and English II will be pilot tested, for a total of 10 WPs. A sample of 2,000 students will be recruited based on the requirement of 200 students per prompt. Ten PEs across the three Mathematics assessments will be pilot tested: two PEs for Algebra I and four PEs each for Algebra II and Geometry. The sample requirement is 300 students per PE.

Table 2.1. Sample Requirements by Content Area (Fall 2017 Pilot)

Content Area   # PEs/WPs   N per PE/WP   Total Sample
English I      5           200           1,000
English II     5           200           1,000
Algebra I      2           300           600
Algebra II     4           300           1,200
Geometry       4           300           1,200

TAC’s feedback is requested on the proposed sample requirements for the pilot test.

2.1. District Selection Options

A representative sample of students is necessary to ensure that the pilot results reflect the student population in terms of student performance and demographic characteristics. Data from the Spring 2016 administration were used to classify districts as low-, middle-, and high-performing based on the district mean scale score: one-third of the districts were classified as low-performing, one-third as middle-performing, and one-third as high-performing.
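To illustrate the classification step, the following Python sketch derives a performance-tercile code from district mean scale scores; the file name and column names are assumptions for illustration only, not the actual layout of Questar's Excel file (described below).

import pandas as pd

# Minimal sketch: classify districts into performance thirds
# (1 = low, 2 = middle, 3 = high) by mean scale score.
# "district_summary.csv" and the column names are hypothetical.
districts = pd.read_csv("district_summary.csv")
districts = districts[districts["student_count"] >= 25]  # very small districts excluded
districts["scale_score_classification"] = pd.qcut(
    districts["mean_scale_score"], q=3, labels=[1, 2, 3]
).astype(int)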
Questar provided DESE with an Excel file that includes the state population as well as district and school performance and demographic data for each content area. The data include:
• district code and name
• student count
• median scale score
• mean scale score
• scale score classification (1 for low, 2 for middle, 3 for high)
• percent pass and fail
• percent by gender
• percent by ethnicity
• percent for Title I
• percent receiving Free and Reduced Lunch
• percent with an Individual Education Plan (IEP)
• percent with Limited English Proficiency (LEP)
• percent receiving accommodations

Questar proposes that DESE choose from the following options for selecting districts, as appropriate for the content-specific pilot requirements. Questar pre-selected districts based on two methods.

1. Select one large, average-performing district that is demographically similar to the population.
2. Select one or more districts from a sample set provided by Questar (i.e., a sample of five or so districts) that are low-, middle-, and high-performing, to achieve a heterogeneous sample and demographic representation.

These districts were identified in the Excel file by colored highlighting. Districts with fewer than 25 students were not considered. Ethnic representation was based on the top three ethnicities: White, African American, and Hispanic.

Table 3.1 presents the sample requirements by district performance.

Table 3.1. Sample Requirements by District Performance (Fall 2017 Pilot)

Content Area   Low-Performing   Middle-Performing   High-Performing   Total Sample
English I      330–335          330–335             330–335           1,000
English II     330–335          330–335             330–335           1,000
Algebra I      200              200                 200               600
Algebra II     400              400                 400               1,200
Geometry       400              400                 400               1,200

TAC’s feedback is requested on the proposed sampling options for selecting districts to participate in the pilot test.

3. Additional Considerations

Questar recommends spiraling the pilot forms at the student level to obtain a random sample of students taking each form (a brief sketch of student-level spiraling follows at the end of this document). Some districts indicated their interest in participating in the Spring 2017 pilot study, and willing participation has been obtained to date.

Does the TAC see any significant gaps in the sampling proposal?
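For reference, this is a minimal Python sketch of the student-level spiraling recommended in Section 3, assuming each student receives exactly one pilot form; the function and form labels are illustrative assumptions, not Questar's operational assignment procedure.

import random

def spiral_forms(student_ids, forms, seed=2017):
    """Shuffle students, then assign forms in rotation so each pilot
    form reaches a random, near-equal share of the sample."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    return {sid: forms[i % len(forms)] for i, sid in enumerate(ids)}

# Example: spiral five hypothetical English II writing prompts across
# the 1,000-student English II sample.
assignments = spiral_forms(range(1000), ["WP1", "WP2", "WP3", "WP4", "WP5"])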