mirror of
https://git.yoctoproject.org/poky
synced 2026-02-21 17:09:42 +01:00
REXML is an XML toolkit for Ruby. The REXML gem before 3.3.9 has a ReDoS vulnerability when it parses XML that has many digits between &# and x...; in a hex numeric character reference (&#x...;). This does not happen with Ruby 3.2 or later. Ruby 3.1 is the only affected maintained Ruby. REXML gem 3.3.9 and later include the patch that fixes the vulnerability.

CVE-2024-49761-0009.patch is the CVE fix and the rest are dependent commits.

Reference: https://nvd.nist.gov/vuln/detail/CVE-2024-49761

Upstream-patch: 810d228523 83ca5c4b0f 51217dbcc6 7e4049f6a6 fc6cad570b 7712855547 370666e314 a579730f25 ce59f2eb1a

(From OE-Core rev: 5b453400e9dd878b81b1447d14b3f518809de17e)

Signed-off-by: Divya Chellam <divya.chellam@windriver.com>
Signed-off-by: Steve Sakoman <steve@sakoman.com>
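
The version boundary stated above can be checked mechanically. The following is a minimal sketch, assuming only the 3.3.9 threshold from the advisory text; `rexml_release_fixed?` and `FIXED_REXML_VERSION` are hypothetical names, not part of REXML.

```ruby
require "rubygems" # provides Gem::Version; normally loaded by default

# First REXML release that ships the fix, per the advisory text above.
FIXED_REXML_VERSION = Gem::Version.new("3.3.9")

# Hypothetical helper (not part of REXML): true when the given release
# string is at or above the first fixed release.
def rexml_release_fixed?(version_string)
  Gem::Version.new(version_string) >= FIXED_REXML_VERSION
end

puts rexml_release_fixed?("3.2.5")
puts rexml_release_fixed?("3.3.9")
```

A version comparison like this cannot see backported fixes. The patches in this series modify the bundled rexml-3.2.5 sources in place, so the reported gem version stays at 3.2.5 even after the fix is applied.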
562 lines
22 KiB
Diff
From 370666e314816b57ecd5878e757224c3b6bc93f5 Mon Sep 17 00:00:00 2001
From: NAITOH Jun <naitoh@gmail.com>
Date: Tue, 27 Feb 2024 09:48:35 +0900
Subject: [PATCH] Use more StringScanner based API to parse XML (#114)

## Why?

Improve maintainability by optimizing the process so that the parsing
process proceeds using StringScanner#scan.

## Changed
- Change `REXML::Parsers::BaseParser` from `frozen_string_literal:
  false` to `frozen_string_literal: true`.
- Added `Source#string=` method for error message output.
- Added TestParseDocumentTypeDeclaration#test_no_name test case.
- Of the `intSubset` of DOCTYPE, "<!" added consideration for processing
  `Comments` that begin with "<!".
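
For readers unfamiliar with the API named above, here is a minimal sketch of how `StringScanner#scan` and `StringScanner#check` behave. It uses only the standard `strscan` library; `demo_scan_and_check` is a hypothetical helper written for illustration and does not appear in REXML.

```ruby
require "strscan"

# Walks a scanner over the input and records what each call returns and
# where the scan pointer ends up afterwards.
def demo_scan_and_check(input)
  scanner = StringScanner.new(input)
  steps = []
  # check: match at the current position without moving the scan pointer.
  steps << [:check, scanner.check("<!"), scanner.pos]
  # scan: the same match, but the scan pointer advances past it.
  steps << [:scan, scanner.scan("<!"), scanner.pos]
  # Matching is anchored at the scan pointer, so no leading \A is needed.
  steps << [:scan, scanner.scan(/DOCTYPE/), scanner.pos]
  # A failed match returns nil and leaves the scan pointer where it was.
  steps << [:scan, scanner.scan(/root/), scanner.pos]
  steps
end

demo_scan_and_check("<!DOCTYPE root>").each { |step| p step }
```

Because every match starts at the scan pointer, the parser can consume a short literal such as `<!` first and then branch on what follows, which is the style the diff below moves to.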

## [Benchmark]

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.0/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     11.240      10.569        17.173       18.219 i/s -     100.000 times in 8.896882s 9.461267s 5.823007s 5.488884s
                 sax     31.812      30.716        48.383       52.532 i/s -     100.000 times in 3.143500s 3.255655s 2.066861s 1.903600s
                pull     36.855      36.354        56.718       61.443 i/s -     100.000 times in 2.713300s 2.750693s 1.763099s 1.627523s
              stream     34.176      34.758        49.801       54.622 i/s -     100.000 times in 2.925991s 2.877065s 2.008003s 1.830779s

Comparison:
                              dom
         after(YJIT):        18.2 i/s
        before(YJIT):        17.2 i/s - 1.06x  slower
              before:        11.2 i/s - 1.62x  slower
               after:        10.6 i/s - 1.72x  slower

                              sax
         after(YJIT):        52.5 i/s
        before(YJIT):        48.4 i/s - 1.09x  slower
              before:        31.8 i/s - 1.65x  slower
               after:        30.7 i/s - 1.71x  slower

                             pull
         after(YJIT):        61.4 i/s
        before(YJIT):        56.7 i/s - 1.08x  slower
              before:        36.9 i/s - 1.67x  slower
               after:        36.4 i/s - 1.69x  slower

                           stream
         after(YJIT):        54.6 i/s
        before(YJIT):        49.8 i/s - 1.10x  slower
               after:        34.8 i/s - 1.57x  slower
              before:        34.2 i/s - 1.60x  slower

```

- YJIT=ON : 1.06x - 1.10x faster
- YJIT=OFF : 0.94x - 1.01x faster

---------

Co-authored-by: Sutou Kouhei <kou@clear-code.com>

CVE: CVE-2024-49761

Upstream-Status: Backport [https://github.com/ruby/rexml/commit/370666e314816b57ecd5878e757224c3b6bc93f5]

Signed-off-by: Divya Chellam <divya.chellam@windriver.com>
---
 .../lib/rexml/parsers/baseparser.rb          | 325 +++++++++---------
 .bundle/gems/rexml-3.2.5/lib/rexml/source.rb |  31 +-
 2 files changed, 188 insertions(+), 168 deletions(-)

diff --git a/.bundle/gems/rexml-3.2.5/lib/rexml/parsers/baseparser.rb b/.bundle/gems/rexml-3.2.5/lib/rexml/parsers/baseparser.rb
index 595669c..bc59bcd 100644
--- a/.bundle/gems/rexml-3.2.5/lib/rexml/parsers/baseparser.rb
+++ b/.bundle/gems/rexml-3.2.5/lib/rexml/parsers/baseparser.rb
@@ -1,4 +1,4 @@
-# frozen_string_literal: false
+# frozen_string_literal: true
require_relative '../parseexception'
require_relative '../undefinednamespaceexception'
require_relative '../source'
@@ -112,6 +112,19 @@ module REXML
"apos" => [/&apos;/, "&apos;", "'", /'/]
}

+ module Private
+ INSTRUCTION_END = /#{NAME}(\s+.*?)?\?>/um
+ TAG_PATTERN = /((?>#{QNAME_STR}))/um
+ CLOSE_PATTERN = /(#{QNAME_STR})\s*>/um
+ ATTLISTDECL_END = /\s+#{NAME}(?:#{ATTDEF})*\s*>/um
+ NAME_PATTERN = /\s*#{NAME}/um
+ GEDECL_PATTERN = "\\s+#{NAME}\\s+#{ENTITYDEF}\\s*>"
+ PEDECL_PATTERN = "\\s+(%)\\s+#{NAME}\\s+#{PEDEF}\\s*>"
+ ENTITYDECL_PATTERN = /(?:#{GEDECL_PATTERN})|(?:#{PEDECL_PATTERN})/um
+ end
+ private_constant :Private
+ include Private
+
def initialize( source )
self.stream = source
@listeners = []
@@ -198,183 +211,172 @@ module REXML
#STDERR.puts @source.encoding
#STDERR.puts "BUFFER = #{@source.buffer.inspect}"
if @document_status == nil
- word = @source.match( /\A((?:\s+)|(?:<[^>]*>))/um )
- word = word[1] unless word.nil?
- #STDERR.puts "WORD = #{word.inspect}"
- case word
- when COMMENT_START
- return [ :comment, @source.match( COMMENT_PATTERN, true )[1] ]
- when XMLDECL_START
- #STDERR.puts "XMLDECL"
- results = @source.match( XMLDECL_PATTERN, true )[1]
- version = VERSION.match( results )
- version = version[1] unless version.nil?
- encoding = ENCODING.match(results)
- encoding = encoding[1] unless encoding.nil?
- if need_source_encoding_update?(encoding)
- @source.encoding = encoding
- end
- if encoding.nil? and /\AUTF-16(?:BE|LE)\z/i =~ @source.encoding
- encoding = "UTF-16"
- end
- standalone = STANDALONE.match(results)
- standalone = standalone[1] unless standalone.nil?
- return [ :xmldecl, version, encoding, standalone ]
- when INSTRUCTION_START
+ if @source.match("<?", true)
return process_instruction
- when DOCTYPE_START
- base_error_message = "Malformed DOCTYPE"
- @source.match(DOCTYPE_START, true)
- @nsstack.unshift(curr_ns=Set.new)
- name = parse_name(base_error_message)
- if @source.match(/\A\s*\[/um, true)
- id = [nil, nil, nil]
- @document_status = :in_doctype
- elsif @source.match(/\A\s*>/um, true)
- id = [nil, nil, nil]
- @document_status = :after_doctype
- else
- id = parse_id(base_error_message,
- accept_external_id: true,
- accept_public_id: false)
- if id[0] == "SYSTEM"
- # For backward compatibility
- id[1], id[2] = id[2], nil
+ elsif @source.match("<!", true)
+ if @source.match("--", true)
+ return [ :comment, @source.match(/(.*?)-->/um, true)[1] ]
+ elsif @source.match("DOCTYPE", true)
+ base_error_message = "Malformed DOCTYPE"
+ unless @source.match(/\s+/um, true)
+ if @source.match(">")
+ message = "#{base_error_message}: name is missing"
+ else
+ message = "#{base_error_message}: invalid name"
+ end
+ @source.string = "<!DOCTYPE" + @source.buffer
+ raise REXML::ParseException.new(message, @source)
end
- if @source.match(/\A\s*\[/um, true)
+ @nsstack.unshift(curr_ns=Set.new)
+ name = parse_name(base_error_message)
+ if @source.match(/\s*\[/um, true)
+ id = [nil, nil, nil]
@document_status = :in_doctype
- elsif @source.match(/\A\s*>/um, true)
+ elsif @source.match(/\s*>/um, true)
+ id = [nil, nil, nil]
@document_status = :after_doctype
else
- message = "#{base_error_message}: garbage after external ID"
- raise REXML::ParseException.new(message, @source)
+ id = parse_id(base_error_message,
+ accept_external_id: true,
+ accept_public_id: false)
+ if id[0] == "SYSTEM"
+ # For backward compatibility
+ id[1], id[2] = id[2], nil
+ end
+ if @source.match(/\s*\[/um, true)
+ @document_status = :in_doctype
+ elsif @source.match(/\s*>/um, true)
+ @document_status = :after_doctype
+ else
+ message = "#{base_error_message}: garbage after external ID"
+ raise REXML::ParseException.new(message, @source)
+ end
end
- end
- args = [:start_doctype, name, *id]
- if @document_status == :after_doctype
- @source.match(/\A\s*/um, true)
- @stack << [ :end_doctype ]
- end
- return args
- when /\A\s+/
- else
- @document_status = :after_doctype
- if @source.encoding == "UTF-8"
- @source.buffer_encoding = ::Encoding::UTF_8
+ args = [:start_doctype, name, *id]
+ if @document_status == :after_doctype
+ @source.match(/\s*/um, true)
+ @stack << [ :end_doctype ]
+ end
+ return args
+ else
+ message = "Invalid XML"
+ raise REXML::ParseException.new(message, @source)
end
end
end
if @document_status == :in_doctype
- md = @source.match(/\A\s*(.*?>)/um)
- case md[1]
- when SYSTEMENTITY
- match = @source.match( SYSTEMENTITY, true )[1]
- return [ :externalentity, match ]
-
- when ELEMENTDECL_START
- return [ :elementdecl, @source.match( ELEMENTDECL_PATTERN, true )[1] ]
-
- when ENTITY_START
- match = [:entitydecl, *@source.match( ENTITYDECL, true ).captures.compact]
- ref = false
- if match[1] == '%'
- ref = true
- match.delete_at 1
- end
- # Now we have to sort out what kind of entity reference this is
- if match[2] == 'SYSTEM'
- # External reference
- match[3] = match[3][1..-2] # PUBID
- match.delete_at(4) if match.size > 4 # Chop out NDATA decl
- # match is [ :entity, name, SYSTEM, pubid(, ndata)? ]
- elsif match[2] == 'PUBLIC'
- # External reference
- match[3] = match[3][1..-2] # PUBID
- match[4] = match[4][1..-2] # HREF
- match.delete_at(5) if match.size > 5 # Chop out NDATA decl
- # match is [ :entity, name, PUBLIC, pubid, href(, ndata)? ]
- else
- match[2] = match[2][1..-2]
- match.pop if match.size == 4
- # match is [ :entity, name, value ]
- end
- match << '%' if ref
- return match
- when ATTLISTDECL_START
- md = @source.match( ATTLISTDECL_PATTERN, true )
- raise REXML::ParseException.new( "Bad ATTLIST declaration!", @source ) if md.nil?
- element = md[1]
- contents = md[0]
-
- pairs = {}
- values = md[0].scan( ATTDEF_RE )
- values.each do |attdef|
- unless attdef[3] == "#IMPLIED"
- attdef.compact!
- val = attdef[3]
- val = attdef[4] if val == "#FIXED "
- pairs[attdef[0]] = val
- if attdef[0] =~ /^xmlns:(.*)/
- @nsstack[0] << $1
- end
+ @source.match(/\s*/um, true) # skip spaces
+ if @source.match("<!", true)
+ if @source.match("ELEMENT", true)
+ md = @source.match(/(.*?)>/um, true)
+ raise REXML::ParseException.new( "Bad ELEMENT declaration!", @source ) if md.nil?
+ return [ :elementdecl, "<!ELEMENT" + md[1] ]
+ elsif @source.match("ENTITY", true)
+ match = [:entitydecl, *@source.match(ENTITYDECL_PATTERN, true).captures.compact]
+ ref = false
+ if match[1] == '%'
+ ref = true
+ match.delete_at 1
end
- end
- return [ :attlistdecl, element, pairs, contents ]
- when NOTATIONDECL_START
- base_error_message = "Malformed notation declaration"
- unless @source.match(/\A\s*<!NOTATION\s+/um, true)
- if @source.match(/\A\s*<!NOTATION\s*>/um)
- message = "#{base_error_message}: name is missing"
+ # Now we have to sort out what kind of entity reference this is
+ if match[2] == 'SYSTEM'
+ # External reference
+ match[3] = match[3][1..-2] # PUBID
+ match.delete_at(4) if match.size > 4 # Chop out NDATA decl
+ # match is [ :entity, name, SYSTEM, pubid(, ndata)? ]
+ elsif match[2] == 'PUBLIC'
+ # External reference
+ match[3] = match[3][1..-2] # PUBID
+ match[4] = match[4][1..-2] # HREF
+ match.delete_at(5) if match.size > 5 # Chop out NDATA decl
+ # match is [ :entity, name, PUBLIC, pubid, href(, ndata)? ]
else
- message = "#{base_error_message}: invalid declaration name"
+ match[2] = match[2][1..-2]
+ match.pop if match.size == 4
+ # match is [ :entity, name, value ]
end
- raise REXML::ParseException.new(message, @source)
- end
- name = parse_name(base_error_message)
- id = parse_id(base_error_message,
- accept_external_id: true,
- accept_public_id: true)
- unless @source.match(/\A\s*>/um, true)
- message = "#{base_error_message}: garbage before end >"
- raise REXML::ParseException.new(message, @source)
+ match << '%' if ref
+ return match
+ elsif @source.match("ATTLIST", true)
+ md = @source.match(ATTLISTDECL_END, true)
+ raise REXML::ParseException.new( "Bad ATTLIST declaration!", @source ) if md.nil?
+ element = md[1]
+ contents = md[0]
+
+ pairs = {}
+ values = md[0].scan( ATTDEF_RE )
+ values.each do |attdef|
+ unless attdef[3] == "#IMPLIED"
+ attdef.compact!
+ val = attdef[3]
+ val = attdef[4] if val == "#FIXED "
+ pairs[attdef[0]] = val
+ if attdef[0] =~ /^xmlns:(.*)/
+ @nsstack[0] << $1
+ end
+ end
+ end
+ return [ :attlistdecl, element, pairs, contents ]
+ elsif @source.match("NOTATION", true)
+ base_error_message = "Malformed notation declaration"
+ unless @source.match(/\s+/um, true)
+ if @source.match(">")
+ message = "#{base_error_message}: name is missing"
+ else
+ message = "#{base_error_message}: invalid name"
+ end
+ @source.string = " <!NOTATION" + @source.buffer
+ raise REXML::ParseException.new(message, @source)
+ end
+ name = parse_name(base_error_message)
+ id = parse_id(base_error_message,
+ accept_external_id: true,
+ accept_public_id: true)
+ unless @source.match(/\s*>/um, true)
+ message = "#{base_error_message}: garbage before end >"
+ raise REXML::ParseException.new(message, @source)
+ end
+ return [:notationdecl, name, *id]
+ elsif md = @source.match(/--(.*?)-->/um, true)
+ case md[1]
+ when /--/, /-\z/
+ raise REXML::ParseException.new("Malformed comment", @source)
+ end
+ return [ :comment, md[1] ] if md
end
- return [:notationdecl, name, *id]
- when DOCTYPE_END
+ elsif match = @source.match(/(%.*?;)\s*/um, true)
+ return [ :externalentity, match[1] ]
+ elsif @source.match(/\]\s*>/um, true)
@document_status = :after_doctype
- @source.match( DOCTYPE_END, true )
return [ :end_doctype ]
end
end
if @document_status == :after_doctype
- @source.match(/\A\s*/um, true)
+ @source.match(/\s*/um, true)
end
begin
- next_data = @source.buffer
- if next_data.size < 2
- @source.read
- next_data = @source.buffer
- end
- if next_data[0] == ?<
- if next_data[1] == ?/
+ if @source.match("<", true)
+ if @source.match("/", true)
@nsstack.shift
last_tag = @tags.pop
- md = @source.match( CLOSE_MATCH, true )
+ md = @source.match(CLOSE_PATTERN, true)
if md and !last_tag
message = "Unexpected top-level end tag (got '#{md[1]}')"
raise REXML::ParseException.new(message, @source)
end
if md.nil? or last_tag != md[1]
message = "Missing end tag for '#{last_tag}'"
- message << " (got '#{md[1]}')" if md
+ message += " (got '#{md[1]}')" if md
+ @source.string = "</" + @source.buffer if md.nil?
raise REXML::ParseException.new(message, @source)
end
return [ :end_element, last_tag ]
- elsif next_data[1] == ?!
- md = @source.match(/\A(\s*[^>]*>)/um)
+ elsif @source.match("!", true)
+ md = @source.match(/([^>]*>)/um)
#STDERR.puts "SOURCE BUFFER = #{source.buffer}, #{source.buffer.size}"
raise REXML::ParseException.new("Malformed node", @source) unless md
- if md[0][2] == ?-
- md = @source.match( COMMENT_PATTERN, true )
+ if md[0][0] == ?-
+ md = @source.match(/--(.*?)-->/um, true)

case md[1]
when /--/, /-\z/
@@ -383,17 +385,18 @@ module REXML

return [ :comment, md[1] ] if md
else
- md = @source.match( CDATA_PATTERN, true )
+ md = @source.match(/\[CDATA\[(.*?)\]\]>/um, true)
return [ :cdata, md[1] ] if md
end
raise REXML::ParseException.new( "Declarations can only occur "+
"in the doctype declaration.", @source)
- elsif next_data[1] == ??
+ elsif @source.match("?", true)
return process_instruction
else
# Get the next tag
- md = @source.match(TAG_MATCH, true)
+ md = @source.match(TAG_PATTERN, true)
unless md
+ @source.string = "<" + @source.buffer
raise REXML::ParseException.new("malformed XML: missing tag start", @source)
end
tag = md[1]
@@ -418,7 +421,7 @@ module REXML
return [ :start_element, tag, attributes ]
end
else
- md = @source.match( TEXT_PATTERN, true )
+ md = @source.match(/([^<]*)/um, true)
text = md[1]
return [ :text, text ]
end
@@ -462,8 +465,7 @@ module REXML

# Unescapes all possible entities
def unnormalize( string, entities=nil, filter=nil )
- rv = string.clone
- rv.gsub!( /\r\n?/, "\n" )
+ rv = string.gsub( /\r\n?/, "\n" )
matches = rv.scan( REFERENCE_RE )
return rv if matches.size == 0
rv.gsub!( /&#0*((?:\d+)|(?:x[a-fA-F0-9]+));/ ) {
@@ -498,9 +500,9 @@ module REXML
end

def parse_name(base_error_message)
- md = @source.match(/\A\s*#{NAME}/um, true)
+ md = @source.match(NAME_PATTERN, true)
unless md
- if @source.match(/\A\s*\S/um)
+ if @source.match(/\s*\S/um)
message = "#{base_error_message}: invalid name"
else
message = "#{base_error_message}: name is missing"
@@ -577,11 +579,28 @@ module REXML
end

def process_instruction
- match_data = @source.match(INSTRUCTION_PATTERN, true)
+ match_data = @source.match(INSTRUCTION_END, true)
unless match_data
message = "Invalid processing instruction node"
+ @source.string = "<?" + @source.buffer
raise REXML::ParseException.new(message, @source)
end
+ if @document_status.nil? and match_data[1] == "xml"
+ content = match_data[2]
+ version = VERSION.match(content)
+ version = version[1] unless version.nil?
+ encoding = ENCODING.match(content)
+ encoding = encoding[1] unless encoding.nil?
+ if need_source_encoding_update?(encoding)
+ @source.encoding = encoding
+ end
+ if encoding.nil? and /\AUTF-16(?:BE|LE)\z/i =~ @source.encoding
+ encoding = "UTF-16"
+ end
+ standalone = STANDALONE.match(content)
+ standalone = standalone[1] unless standalone.nil?
+ return [ :xmldecl, version, encoding, standalone ]
+ end
[:processing_instruction, match_data[1], match_data[2]]
end

diff --git a/.bundle/gems/rexml-3.2.5/lib/rexml/source.rb b/.bundle/gems/rexml-3.2.5/lib/rexml/source.rb
index db78a12..4111d1d 100644
--- a/.bundle/gems/rexml-3.2.5/lib/rexml/source.rb
+++ b/.bundle/gems/rexml-3.2.5/lib/rexml/source.rb
@@ -76,6 +76,10 @@ module REXML
end
end

+ def string=(string)
+ @scanner.string = string
+ end
+
# @return true if the Source is exhausted
def empty?
@scanner.eos?
@@ -150,28 +154,25 @@ module REXML
def read
begin
@scanner << readline
+ true
rescue Exception, NameError
@source = nil
+ false
end
end

def match( pattern, cons=false )
- if cons
- md = @scanner.scan(pattern)
- else
- md = @scanner.check(pattern)
- end
- while md.nil? and @source
- begin
- @scanner << readline
- if cons
- md = @scanner.scan(pattern)
- else
- md = @scanner.check(pattern)
- end
- rescue
- @source = nil
+ read if @scanner.eos? && @source
+ while true
+ if cons
+ md = @scanner.scan(pattern)
+ else
+ md = @scanner.check(pattern)
end
+ break if md
+ return nil if pattern.is_a?(String) && pattern.bytesize <= @scanner.rest_size
+ return nil if @source.nil?
+ return nil unless read
end

md.nil? ? nil : @scanner
--
2.40.0
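
The reworked `match` loop in the source.rb hunk above retries a failed match after pulling in more input, and gives up early when a literal pattern already fits in the remaining buffer. The following is a minimal sketch of that control flow under simplifying assumptions: `LineBufferedSource` is a hypothetical stand-in that reads lines from any IO with `gets`, returns the matched string rather than the scanner, and ignores the encoding handling that the real REXML class performs.

```ruby
require "strscan"
require "stringio"

# Hypothetical, simplified stand-in for REXML's IO-backed source.
class LineBufferedSource
  def initialize(io)
    @io = io
    @scanner = StringScanner.new(+"") # unfrozen buffer so << can append
  end

  # Appends one more line to the buffer; false once input is exhausted.
  def read
    line = @io.gets
    return false if line.nil?
    @scanner << line
    true
  end

  # Returns the matched string, or nil when no amount of input can match.
  def match(pattern, consume = false)
    loop do
      md = consume ? @scanner.scan(pattern) : @scanner.check(pattern)
      return md if md
      # A literal that already fits in the unread buffer and did not match
      # cannot start matching once more input is appended.
      return nil if pattern.is_a?(String) && pattern.bytesize <= @scanner.rest_size
      return nil unless read
    end
  end
end

source = LineBufferedSource.new(StringIO.new("<a>\ntext\n</a>\n"))
p source.match("<", true)
p source.match(/a>\s*text/, true)
```

The early return for string patterns is what keeps short literal probes such as `"<"` or `"/"` from reading the rest of the input whenever they fail to match.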